Re: Upgrading from 2.8.0 to 3.5.1

2023-10-04 Thread Josep Prat
Hi Chester,

If I'm not mistaken, it means updating the Kafka version on the broker:
Broker with version 2.8.0 -> Shutdown -> Replace the jars and binaries so it
contains Kafka 3.5.1 -> Start it (one broker at a time for a rolling upgrade).

Best,

On Wed, Oct 4, 2023 at 9:07 AM Walchester Gaw  wrote:

> Hello.
>
> I would like to seek clarification on what it means to "update the code"
> in your tutorial on
> how to upgrade. How do I update the code? Which code should I update?
>
> [image: image.png]
>
> Thanks,
> Chester
>


-- 
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [VOTE] KIP-951: Leader discovery optimisations for the client

2023-10-04 Thread Mayank Shekhar Narula
Thank you all for your votes, Jun, David, and Jason!

On Tue, Oct 3, 2023 at 11:44 PM Jason Gustafson 
wrote:

> +1 Thanks for the KIP
>
> On Tue, Oct 3, 2023 at 12:30 PM David Jacot  wrote:
>
> > Thanks for the KIP. +1 from me as well.
> >
> > Best,
> > David
> >
> > On Tue, Oct 3, 2023 at 20:54, Jun Rao wrote:
> >
> > > Hi, Mayank,
> > >
> > > Thanks for the detailed explanation in the KIP. +1 from me.
> > >
> > > Jun
> > >
> > > On Wed, Sep 27, 2023 at 4:39 AM Mayank Shekhar Narula <
> > > mayanks.nar...@gmail.com> wrote:
> > >
> > > > Reviving this thread, as the discussion thread has been updated.
> > > >
> > > > On Fri, Jul 28, 2023 at 11:29 AM Mayank Shekhar Narula <
> > > > mayanks.nar...@gmail.com> wrote:
> > > >
> > > > > Thanks Jose.
> > > > >
> > > > > On Thu, Jul 27, 2023 at 5:46 PM José Armando García Sancio
> > > > >  wrote:
> > > > >
> > > > >> The KIP LGTM. Thanks for the design. I am looking forward to the
> > > > >> implementation.
> > > > >>
> > > > >> +1 (binding).
> > > > >>
> > > > >> Thanks!
> > > > >> --
> > > > >> -José
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Mayank Shekhar Narula
> > > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Mayank Shekhar Narula
> > > >
> > >
> >
>


-- 
Regards,
Mayank Shekhar Narula


[PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


raboof opened a new pull request, #554:
URL: https://github.com/apache/kafka-site/pull/554

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Bruno Cadonna

Hi Hanyu,

I agree with what others said about having a `withDescendingOrder()` 
method and about documenting how the results are ordered.


I would not add a reverse flag and a parameter to each method in 
RangeQuery. That makes the API less fluent and harder to maintain, since 
the flag would change all methods. There is no constraint to only add 
static factory methods to RangeQuery. In fact, if you look into the 
existing class KeyQuery, more precisely at skipCache(), and into the 
proposals for queries of versioned state stores, i.e., KIP-969, KIP-968, 
and KIP-960, we already have examples where we set a flag with an 
instance method, for example, asOf(). Such methods make the API more 
fluent and limit the blast radius of the flag to only one public method 
(plus the getter).
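
To illustrate the pattern, a minimal sketch (assuming the KIP-985 class 
shape; the withDescendingKeys() name follows the discussion in this thread 
and is not final) could look like this:

import java.util.Optional;

import org.apache.kafka.streams.query.Query;
import org.apache.kafka.streams.state.KeyValueIterator;

// Minimal sketch, not the actual KIP code: the ordering flag sits behind one
// instance method (plus its getter), so the static factories stay untouched
// and the query object remains immutable.
public final class RangeQuery<K, V> implements Query<KeyValueIterator<K, V>> {

    private final Optional<K> lower;
    private final Optional<K> upper;
    private final boolean descending;

    private RangeQuery(final Optional<K> lower, final Optional<K> upper,
                       final boolean descending) {
        this.lower = lower;
        this.upper = upper;
        this.descending = descending;
    }

    // Existing-style static factory: ascending by default.
    public static <K, V> RangeQuery<K, V> withRange(final K lower, final K upper) {
        return new RangeQuery<>(Optional.of(lower), Optional.of(upper), false);
    }

    // Fluent instance method: returns a copy of the query with the flag set.
    public RangeQuery<K, V> withDescendingKeys() {
        return new RangeQuery<>(lower, upper, true);
    }

    // The getter that goes with the flag.
    public boolean isDescending() {
        return descending;
    }
}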


So, making a query that reads the state store in reversed order would 
then result in:


final RangeQuery<Integer, Integer> query = 
RangeQuery.withRange(1, 1000).withDescendingKeys();


I think this is more readable than:

final RangeQuery<Integer, Integer> query = 
RangeQuery.withRange(1, 1000, true);


Additionally, I think the KIP would benefit from a usage example of the 
newly introduced methods like in KIP-969 etc.


In my opinion, the test plan should also mention that you plan to 
write/adapt unit tests.


Best,
Bruno

On 10/4/23 5:16 AM, Hanyu (Peter) Zheng wrote:

If we use WithDescendingKeys() to generate a RangeQuery to do the
reverseQuery, how do we combine it with methods like withRange, withUpperBound,
and withLowerBound?

On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
wrote:


I believe there's no need to introduce a method like WithDescendingKeys().
Instead, we can simply add a reverse flag to RangeQuery. Each method within
RangeQuery would then accept an additional parameter. If reverse is set
to true, it indicates the results should be reversed.

Initially, I introduced a reverse variable. When set to false, the
RangeQuery class behaves normally. However, when reverse is set to true,
the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
Further details can be found in the "Rejected Alternatives" section.

In my perspective, RangeQuery is a class responsible for creating a series
of RangeQuery objects. It offers methods such as withRange, withUpperBound,
and withLowerBound, allowing us to generate objects representing different
queries. I'm unsure how adding a withDescendingOrder() method would be
compatible with the other methods, especially considering that, based on
KIP 969, WithDescendingKeys() doesn't appear to take any input variables.
And if withDescendingOrder() doesn't accept any input, how does it return a
RangeQuery?

On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
wrote:


Hi, Colt,
The underlying structure of inMemoryKeyValueStore is treeMap.
Sincerely,
Hanyu

On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
wrote:


Hi Bill,
1. I will update the KIP in accordance with the PR and synchronize their
future updates.
2. I will use that name.
3. you mean add something about ordering at the motivation section?

Sincerely,
Hanyu


On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
wrote:


Hi, Walker,

1. I will update the KIP in accordance with the PR and synchronize
their future updates.
2. I will use that name.
3. I'll provide additional details in that section.
4. I intend to utilize rangeQuery to achieve what we're referring to as
reverseQuery. In essence, reverseQuery is merely a term. To clear up any
ambiguity, I'll make necessary adjustments to the KIP.

Sincerely,
Hanyu



On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
wrote:


Ok, I will change it back to following the code, and update them
together.

On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
 wrote:


Hello Hanyu,

Looking over your KIP, things mostly make sense, but I have a couple of
comments.


1. You have "withDescandingOrder()". I think you mean "descending" :)
Also there are still a few places in the doc where it's called
"setReverse"
2. Also I like "WithDescendingKeys()" better
3. I'm not sure of what ordering guarantees we are offering. Perhaps we
can add a section to the motivation clearly spelling out the current
ordering and the new offering?
4. When you say "use unbounded reverseQuery to achieve reverseAll" do
you mean "use unbounded RangeQuery to achieve reverseAll"? As far as I
can tell we don't have a reverseQuery as a named object?


Looking good so far

best,
Walker

On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
wrote:


Hello Hanyu,

Thank you for the KIP. I agree with Matthias' proposal to keep the naming
convention consistent with KIP-969. I favor the `.withDescendingKeys()` name.

I am curious about one thing. RocksDB guarantees that records returned
during a range scan are lexicographically ordered by the bytes of the keys
(either ascending or descending order, as specified in the query). This
means that results within a single partition 

Upgrading from 2.8.0 to 3.5.1

2023-10-04 Thread Walchester Gaw
Hello.

I would like to seek clarification on what it means to "update the code" in
your tutorial on how
to upgrade. How do I update the code? Which code should I update?

[image: image.png]

Thanks,
Chester


Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


divijvaidya commented on code in PR #554:
URL: https://github.com/apache/kafka-site/pull/554#discussion_r1345468001


##
project-security.html:
##
@@ -35,6 +35,22 @@ Kafka security

 For a list of security issues fixed in released 
versions of Apache Kafka, see CVE list.

+   Advisories for dependencies
+   
+   Many organizations use 'security scanning' tools to 
detect components for which advisories exist. While we generally encourage 
using such tools, since they are an important way users are notified of risks, 
our experience is that they produce a lot of false positives: when a dependency 
of Kafka contains a vulnerability, it is likely Kafka is using it in a way that 
is not affected. As such, we do not consider the fact that an advisory has been 
published for a Kafka dependency sensitive. Only when additional analysis 
confirms Kafka is affected by the problem, we ask you to report this finding 
privately through secur...@kafka.apache.org.

Review Comment:
   I would be more comfortable if we add something like "if you are unsure 
about impact to Kafka, err on the side of reporting". Asking this because users 
in the community may not be familiar with the breadth of components in the 
Kafka code base to make a call on impact.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [VOTE] KIP-951: Leader discovery optimisations for the client

2023-10-04 Thread Mayank Shekhar Narula
Summarising, there are 5 binding votes (Luke, Jose, Jun, David, Jason) and
1 non-binding vote (Kirk).

With the current status of voting, the KIP is accepted.

Thanks again to all reviewers and voters.



On Wed, Oct 4, 2023 at 9:37 AM Mayank Shekhar Narula <
mayanks.nar...@gmail.com> wrote:

> Thank you all for your votes, Jun, David, and Jason!
>
> On Tue, Oct 3, 2023 at 11:44 PM Jason Gustafson 
> wrote:
>
>> +1 Thanks for the KIP
>>
>> On Tue, Oct 3, 2023 at 12:30 PM David Jacot 
>> wrote:
>>
>> > Thanks for the KIP. +1 from me as well.
>> >
>> > Best,
>> > David
>> >
> >> > On Tue, Oct 3, 2023 at 20:54, Jun Rao wrote:
>> >
>> > > Hi, Mayank,
>> > >
>> > > Thanks for the detailed explanation in the KIP. +1 from me.
>> > >
>> > > Jun
>> > >
>> > > On Wed, Sep 27, 2023 at 4:39 AM Mayank Shekhar Narula <
>> > > mayanks.nar...@gmail.com> wrote:
>> > >
>> > > > Reviving this thread, as the discussion thread has been updated.
>> > > >
>> > > > On Fri, Jul 28, 2023 at 11:29 AM Mayank Shekhar Narula <
>> > > > mayanks.nar...@gmail.com> wrote:
>> > > >
>> > > > > Thanks Jose.
>> > > > >
>> > > > > On Thu, Jul 27, 2023 at 5:46 PM José Armando García Sancio
>> > > > >  wrote:
>> > > > >
>> > > > >> The KIP LGTM. Thanks for the design. I am looking forward to the
>> > > > >> implementation.
>> > > > >>
>> > > > >> +1 (binding).
>> > > > >>
>> > > > >> Thanks!
>> > > > >> --
>> > > > >> -José
>> > > > >>
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Regards,
>> > > > > Mayank Shekhar Narula
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Regards,
>> > > > Mayank Shekhar Narula
>> > > >
>> > >
>> >
>>
>
>
> --
> Regards,
> Mayank Shekhar Narula
>


-- 
Regards,
Mayank Shekhar Narula


Re: [DISCUSS] KIP-980: Allow creating connectors in a stopped state

2023-10-04 Thread Yash Mayya
Hi Chris,

Thanks for the quick follow up and the continued insightful discourse!

1. Fair point on the need to differentiate it from the actual state
displayed in the status API, I like the prefix of "initial" to make that
differentiation (from your suggested alternatives previously). Regarding
the dots vs underscores as delimiters - the new state field will be a top
level field in the connector creation request body alongside the "config"
map (i.e. it won't be a connector configuration itself), so I think we
should be using the underscore delimiter for consistency. For now, I've
updated the KIP to use "initial_state" as the new field's name - let me
know if you disagree, and I'd be happy to reconsider.
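
For illustration, a creation request with the new field could then look like
the following sketch (connector class and config values are placeholders only):

{
  "name": "my-sink",
  "initial_state": "STOPPED",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "my-topic",
    "file": "/tmp/my-sink.out"
  }
}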

2. Hm, I actually hadn't considered the downgrade implications with your
proposed single record approach. I agree that it's a bigger downside than
writing two records to the config topic. I do understand your concerns with
the potential for config topic inconsistencies which is why I proposed
writing the target state first (since the presence of a target state for a
connector with no configuration is a benign condition). Also, even in the
non-transactional config topic producer case - if there is a failure
between the two writes, the user will be notified of the error
synchronously via the API response (ref -
https://github.com/apache/kafka/pull/12984) and will be able to safely
retry the operation. I don't see how we'd be able to do a single record
write approach along with supporting clean downgrades since we'd either
need to introduce a new record type or add a new field to an existing
record type - neither of which would be recognized as such by an older
Connect worker.

> Standalone mode has always supported the REST API,
> and so far FWICT we've maintained feature parity between
> the two modes

> add support for JSON files with standalone mode.

3. Thanks, I wasn't aware that standalone mode has always supported the
full REST API - I thought I'd seen some references earlier indicating
otherwise. In that case, I do agree that it makes sense to maintain parity
across both methods of connector creation for user experience consistency.
I really like the idea of updating the standalone mode CLI to be able to
parse JSON files (in the same format as the connector creation REST API
endpoint request body) along with Java properties files since I think that
offers two big benefits. One is that users will be able to copy and use
examples across both the methods of connector creation (REST API requests
with JSON request bodies and JSON files passed to the standalone mode
startup CLI). The second benefit is that any future extensions (such as the
"offsets" field we've discussed in this thread) would be easily applied
across both the methods consistently instead of introducing new (and likely
ugly) CLI flags. I've updated the KIP to include this change in the
standalone mode CLI.

4. Makes sense, I've added this under a new "Future Work" section in the
KIP.

6. From what I can tell, there shouldn't be any issues with the lack of
task configurations in the config topic and it seems to be a supported
assumption across the Connect code-base that a connector configuration
could exist without any task configurations for the connector (a situation
that could currently manifest with slow starting connectors, connectors
that fail during startup, connectors that fail to generate task
configurations, connectors that are paused right after being created etc.).
I did also try out a small prototype before publishing this KIP and things
do work as expected when creating a connector in the PAUSED / STOPPED state
by simply writing the appropriate target state along with the connector
configuration to the config topic.
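
Illustratively, and assuming the config topic record formats stay as they are
today (key names follow the current worker implementation; value shapes here
are a sketch, not a spec), the two writes for a connector created in the
STOPPED state would look roughly like:

key:   target-state-my-sink   (written first; benign without a config)
value: { "state": "STOPPED" }

key:   connector-my-sink
value: { "properties": { "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector", "tasks.max": "1", "topics": "my-topic" } }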

Thanks,
Yash

On Wed, Oct 4, 2023 at 1:44 AM Chris Egerton 
wrote:

> Hi Yash,
>
> Thanks for the in-depth discussion! Continuations here:
>
> 1. Regarding delimiters (dots vs. underscores), we use dots in connector
> configs for all runtime-recognized properties, including but not limited to
> connector.class, tasks.max, key.converter, key.converter.*, and
> transforms.*.type. Regarding the choice of name--I think it's because the
> concept here is different from the "state" that we display in the status
> API that a different name is warranted. Users can't directly write the
> connector's actual state, they can still only specify a target state. And
> there's no guarantee that a connector will ever enter a target state
> (whether it's specified at creation time or later), since a failure can
> always occur.
>
> 2. It still seems simpler to emit one record instead of two, especially
> since we're not guaranteed that the leader will be using a transactional
> producer. I guess I'm just wary of knowingly putting the config topic in an
> inconsistent state (target state present without accompanying connector
> config), even if it's meant to be only for a brief period. Thinking about
> it some more though, a 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-04 Thread Bruno Cadonna

Hi,

Regarding tombstones:
As far as I understand, we need to either add a validTo field to 
VersionedRecord or return tombstones; otherwise the result is 
not complete, because users could never know that a record was deleted at 
some point before the second non-null value was put.
I prefer adding the validTo field, since it makes the result more 
concise and easier to interpret.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones into 
a versioned state store? I guess if somebody does that on purpose, then 
there should be a way to retrieve each of those tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see the 
need.
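
For concreteness, a rough sketch of what a validTo field on VersionedRecord 
could look like (field names and types are assumptions, not the final API):

import java.util.Optional;

// Sketch only: VersionedRecord extended with a validTo timestamp. An empty
// validTo means the record is still the current version for its key.
public final class VersionedRecord<V> {

    private final V value;
    private final long timestamp;         // when this version was put (validFrom)
    private final Optional<Long> validTo; // when it was overwritten/deleted, if ever

    public VersionedRecord(final V value, final long timestamp,
                           final Optional<Long> validTo) {
        this.value = value;
        this.timestamp = timestamp;
        this.validTo = validTo;
    }

    public V value() {
        return value;
    }

    public long timestamp() {
        return timestamp;
    }

    public Optional<Long> validTo() {
        return validTo;
    }
}

With that shape, the example above would come back as (v1, 0, 5) and 
(v2, 20, empty).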



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or .until()). 
If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs), 
because the query becomes more readable:


MultiVersionedKeyQuery
  .withKey(1)
  .withinTimeRange(Instant.parse("2023-08-03T10:37:30.00Z"),
Instant.parse("2023-08-04T10:37:30.00Z"))


If we stay with .from() and .until(), we should consider .fromTime() and 
.untilTime() (or .toTime()):


MultiVersionedKeyQuery
 .withKey(1)
 .fromTime(Instant.parse("2023-08-03T10:37:30.00Z"))
 .untilTime(Instant.parse("2023-08-04T10:37:30.00Z"))



Regarding asOf vs. until:
I think asOf() is more commonly used in point-in-time queries, as Walker 
mentioned, whereas this KIP specifies a time range. IMO asOf() fits very well 
with KIP-960, where one version is queried, but here I think .until() fits 
better. That might just be a matter of taste, and in the end I am fine 
with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I would 
keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me where
until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also, for these methods it makes sense to drop the "get". We don't
normally use that in getters

/**
 * The key that was specified for this query.
 */
public K getKey();

/**
 * The starting time point of the query, if specified.
 */
public Optional<Instant> getFromTimestamp();

/**
 * The ending time point of the query, if specified.
 */
public Optional<Instant> getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the direction
of the KIP and think the functionality is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:


Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her email) that
a range query would not return any tombstones, i.e., only two records in
Victoria's example. Thus, it seems best to include a `validTo` ts-field
in `VersionedRecord` -- otherwise, the retrieved result cannot be
interpreted correctly.

Not sure what others think about it.

I would also be open to actually adding an `includeDeletes()` (or
`includeTombstones()`) method/flag (disabled by default) to allow users
to get all tombstones: this would only be helpful if there are two
consecutive tombstones though (if I got it right), so not sure if we want
to add it or not -- it seems also possible to add it later if there is
user demand for it, so it might be a premature addition at this point?


Nit:


the public interface ValueIterator is used


"is used" -> "is added" (otherwise it sounds like as if `ValueIterator`
exist already)



Should we also add a `.within(fromTs, toTs)` (or maybe some better
name?) to allow specifying both bounds at once? The existing
`RangeQuery` does the same for specifying the key-range, so might be
good to add for time-range too?



-Matthias


On 9/6/23 5:01 AM, Bruno Cadonna wrote:

In my last e-mail I missed to finish a sentence.

"I think from a KIP"

should be

"I think the KIP looks good!"


On 9/6/23 1:59 PM, Bruno Cadonna wrote:

Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an
IllegalStateException for meaningless combinations. In any case, the
KIP should specify 

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2254

2023-10-04 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 315741 lines...]

Gradle Test Run :streams:test > Gradle Test Executor 87 > TaskAndActionTest > shouldThrowIfRemoveTaskActionIsCreatedWithNullTaskId() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToSuspendShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToSuspendShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldOnlyKeepLastUpdateAction() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldOnlyKeepLastUpdateAction() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToRecycle() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToRecycle() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToUpdateInputPartitionsShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToUpdateInputPartitionsShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldVerifyIfPendingTaskToRecycleExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldVerifyIfPendingTaskToRecycleExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToUpdateInputPartitions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToUpdateInputPartitions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToSuspend() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToSuspend() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldVerifyIfPendingTaskToInitExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldVerifyIfPendingTaskToInitExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > TasksTest > shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > More than one query result throws IllegalArgumentException STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > More than one query result throws IllegalArgumentException PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > Zero query results shouldn't error STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > Zero query results shouldn't error PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > Valid query results still works STARTED

Gradle Test Run :streams:test > Gradle Test Executor 87 > StateQueryResultTest > Valid query results still works PASSED

Gradle Test Run :streams:test > Gradle Test Executor 87 

Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


bmscomp commented on PR #554:
URL: https://github.com/apache/kafka-site/pull/554#issuecomment-1746451332

   Thanks @raboof, can you add a description detailing the purpose of this 
pull request?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


divijvaidya commented on PR #554:
URL: https://github.com/apache/kafka-site/pull/554#issuecomment-1746461093

   cc: @mimaison @ijuma @showuon, I would also solicit your opinion on this one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-04 Thread Viktor Somogyi-Vass
Hi Greg,

Thanks for the answers. I think they all make sense.

Another point I realized last evening is that now that tiered storage (TS)
is available, it might complicate things with CCR. What I'm thinking of is
that if you have multiple clusters in multiple regions, enabling the object
storage's replication between zones could be much more cost efficient than
replicating local+remote offsets through Kafka. You'd only need to copy
local segments over and remote partition replication would be done by the
remote layer. Or the user could simply choose to not replicate remote
segments between regions but instead just reference them (so that the
backup cluster's remote offsets point to the original region). These
options, however, likely require more coordination between clusters than in
pre-TS Kafka. Do you think we should take this into consideration in the
design and in the UX?

Thanks,
Viktor

On Tue, Oct 3, 2023 at 6:30 PM Greg Harris 
wrote:

> Hi Viktor,
>
> Thanks for your questions! I agree, replication is very fundamental in
> Kafka, so it's been implemented in many different ways by different
> people. I hope that this is the last implementation we'll need, but
> every software engineer says that :)
>
> GT-1: I think that, as this KIP is very focused on the UX of the feature,
> user stories are appropriate to include. I think it isn't
> necessary to explain how the different applications are accomplished
> with MM2 or other solutions, but describing what they will look like
> after this KIP would be a wonderful addition. +1
>
> MM2-1: I think that replacing the consumer is insufficient, as we need
> a more expressive producer as well. This is not possible within the
> design constraints of MM2 as a Connector, as MM2 uses the
> connect-managed producer. This could be implemented in MM3 as a new
> process that can use more expressive "internal clients", but then
> we've thrown away the Connect runtime that made MM2 easier to run for
> some users.
> MM2-2: This is technically possible, but sounds operationally hazardous to
> me.
> MM2-3: From the user perspective, I believe that CCR can be made more
> simple to use and operate than MM2, while providing better guarantees.
> From the implementation standpoint, I think that CCR will be
> significantly more complex, as the architecture of MM2 leverages a lot
> of the Connect infrastructure.
>
> LaK-1: Yes, I think you understand what I was going for.
> LaK-2: I don't think that this is a user experience that we could add
> to CCR without changing the Kafka clients to be aware of both clusters
> concurrently. In order to redirect clients away from a failed cluster
> with a metadata refresh, the cluster that they're currently connected
> to must give them that data. But because the cluster failed, that
> refresh will not be reliable. With a proxy between the client and
> Kafka, that proxy can be available while the original Kafka cluster is
> not. Failovers would happen between distinct sets of clients that are
> part of the same logical application.
>
> Thanks for taking a look at the rejected alternatives!
> Greg
>
> On Tue, Oct 3, 2023 at 3:24 AM Viktor Somogyi-Vass
>  wrote:
> >
> > Hi Greg,
> >
> > Seems like finding the perfect replication solution is a never ending
> story
> > for Kafka :).
> >
> > Some general thoughts:
> > GT-1. While as you say it would be good to have some kind of built-in
> > replication in Kafka, we definitely need to understand the problem better
> > to provide a better solution. Replication has lots of user stories, as you
> > iterated over a few, and I think it's well worth the time to detail
> > each one in the KIP. This may help others who may want to contribute
> > understand the problem on a deeper level, somewhat sets the scope, and
> > describes the problem in a way that a good solution can be deduced from
> > it.
> >
> > I also have a few questions regarding some of the rejected solutions:
> >
> > MM2:
> > I think your points about MM2 are fair (offset transparency and
> operational
> > complexity), however I think it needs more reasoning about why we are
> > moving in a different direction.
> > A few points I can think of that we could improve in MM2 to turn it
> > into more like the solution you aim for:
> > MM2-1. What if we consider replacing the client based mechanism with a
> > follower fetch protocol?
> > MM2-2. Operating an MM2 cluster might be familiar to those who operate
> > Connect anyway. For those who don't, can we provide a "built-in" version
> > that runs in the same process as Kafka, like an embedded dedicated MM2
> > cluster?
> > MM2-3. Will we actually be able to achieve less complexity with a
> built-in
> > solution?
> >
> > Layer above Kafka:
> > LaK-1. Would you please add more details about this? What I can currently
> > think of is that this "layer above Kafka" would be some kind of a proxy
> > which would proactively send an incoming request to multiple clusters
> like
> > 

Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


raboof commented on PR #554:
URL: https://github.com/apache/kafka-site/pull/554#issuecomment-1746475927

   > Thanks @raboof , can you put a description for detailing the purpose of 
this pull request ?
   
   (updated the commit and PR message)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (KAFKA-15537) Unsafe metadata.version downgrade is not supported

2023-10-04 Thread Federico Valeri (Jira)
Federico Valeri created KAFKA-15537:
---

 Summary: Unsafe metadata.version downgrade is not supported
 Key: KAFKA-15537
 URL: https://issues.apache.org/jira/browse/KAFKA-15537
 Project: Kafka
  Issue Type: Bug
Reporter: Federico Valeri


In KIP-778 we introduced the "unsafe" downgrade functionality for the case where 
one of the metadata versions between current and target has changes, as defined in 
MetadataVersion. This is a lossy downgrade where each node rebuilds its 
metadata snapshots, omitting the new metadata fields. Currently, this is not 
supported, as shown by the following command.

{code}
bin/kafka-features.sh --bootstrap-server :9092 downgrade --metadata 3.4 --unsafe
Could not downgrade metadata.version to 8. Invalid metadata.version 8.
Unsafe metadata downgrade is not supported in this version.
1 out of 1 operation(s) failed.
{code}





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


raboof commented on code in PR #554:
URL: https://github.com/apache/kafka-site/pull/554#discussion_r1345493725


##
project-security.html:
##
@@ -35,6 +35,22 @@ Kafka security

 For a list of security issues fixed in released 
versions of Apache Kafka, see CVE list.

+   Advisories for dependencies
+   
+   Many organizations use 'security scanning' tools to 
detect components for which advisories exist. While we generally encourage 
using such tools, since they are an important way users are notified of risks, 
our experience is that they produce a lot of false positives: when a dependency 
of Kafka contains a vulnerability, it is likely Kafka is using it in a way that 
is not affected. As such, we do not consider the fact that an advisory has been 
published for a Kafka dependency sensitive. Only when additional analysis 
confirms Kafka is affected by the problem, we ask you to report this finding 
privately through secur...@kafka.apache.org.

Review Comment:
   I'm happy to adapt the wording - of course it's up to you as PMC to decide 
how you want to deal with such cases.
   
   For context: since anyone can easily run a dependency scanner, it's been our 
general policy to consider the mere fact that an advisory exists for a 
dependency already public knowledge. If a user is unsure about the impact, that 
doesn't really introduce any new information, so private reporting might not 
yet be necessary at that point.
   
   I now changed the wording from "Only when additional analysis confirms Kafka 
is affected" to "Only when additional analysis suggests Kafka may be affected" 
- let me know if you'd like to see a stronger bias towards private reporting.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Added a blog entry for 3.6.0 release [kafka-site]

2023-10-04 Thread via GitHub


mimaison commented on code in PR #547:
URL: https://github.com/apache/kafka-site/pull/547#discussion_r1345510589


##
blog.html:
##
@@ -22,6 +22,54 @@
 
 
 Blog
+
+Apache Kafka 3.6.0 Release Announcement
+
+15 Sep 2023 - Satish Duggana (@SatishDuggana, https://twitter.com/0xeed)
+We are proud to announce the release of Apache Kafka 3.6.0. This release contains many new features and improvements. This blog post will highlight some of the more prominent features. For a full list of changes, be sure to check the release notes: https://downloads.apache.org/kafka/3.6.0/RELEASE_NOTES.html
+See the "Upgrading to 3.6.0 from any version 0.8.x through 3.5.x" section of the documentation (https://kafka.apache.org/36/documentation.html#upgrade_3_6_0) for the list of notable changes and detailed upgrade steps.
+
+The ability to migrate Kafka clusters from a ZooKeeper metadata system to a KRaft metadata system is
+now considered stable and suitable for production environments. See the ZooKeeper to KRaft migration
+operations documentation (https://kafka.apache.org/documentation/#kraft_zk_migration) for
+details. Note that support for JBOD is still not available for KRaft clusters, therefore clusters
+utilizing JBOD cannot be migrated. See KIP-858 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-858%3A+Handle+JBOD+broker+disk+failure+in+KRaft)
+for details regarding KRaft and JBOD.
+
+Support for Delegation Tokens in KRaft (KAFKA-15219: https://issues.apache.org/jira/browse/KAFKA-15219) was completed in 3.6, further reducing the gap of features between ZooKeeper-based Kafka clusters and KRaft. Migration of delegation tokens from ZooKeeper to KRaft is also included in 3.6.
+Tiered Storage (https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage) is an early access feature. It is currently only suitable for testing in non-production environments. See the Early Access Notes (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Tiered+Storage+Early+Access+Release+Notes) for more details.
+
+Note: ZooKeeper is marked as deprecated since the 3.5.0 release. ZooKeeper is planned to be removed in Apache Kafka 4.0. (Cf. ZooKeeper Deprecation)
+Kafka Broker, Controller, Producer, Consumer and Admin Client
+
+KIP-405: Kafka Tiered Storage (https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage). It introduces the tiered storage feature in Kafka that provides separation of computation and storage in the broker.
+KIP-797: Accept duplicate listener on port for IPv4/IPv6 (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330). Brokers can be configured with listeners that have the same port on different IP stacks, like IPv4 and IPv6.
+KIP-863: Reduce CompletedFetch#parseRecord() memory copy (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=225152035). Reduced CompletedFetch#parseRecord() memory copy by deserializing using byte buffers.

Review Comment:
   ```suggestion
   KIP-863: Reduce CompletedFetch#parseRecord() memory copy: KIP-863 (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=225152035) reduces CompletedFetch#parseRecord() memory copy by deserializing using byte buffers.
   ```




[jira] [Resolved] (KAFKA-10199) Separate state restoration into separate threads

2023-10-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna resolved KAFKA-10199.
---
Resolution: Done

> Separate state restoration into separate threads
> 
>
> Key: KAFKA-10199
> URL: https://issues.apache.org/jira/browse/KAFKA-10199
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Bruno Cadonna
>Priority: Major
>  Labels: new-streams-runtime-should-fix
>
> As part of the restoration optimization effort, we would like to move the 
> restoration process to separate threads such that:
> 1. Stream threads would not be restricted by the main consumer `poll` 
> frequency to keep as part of the group.
> 2. We can allow larger batches of data to be written into the restoration.
> Besides this, we'd also like to fix the known issues that for piggy-backed 
> source topics as changelog topics, the serde exception / extra processing 
> logic would be skipped.
> We would also cleanup the global update tasks as part of this effort to 
> consolidate to the separate restoration threads, and would also gear them up 
> with corresponding monitoring metrics (KIPs in progress).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-932: Queues for Kafka

2023-10-04 Thread Jack Vanlightly
I would like to see more explicit discussion of topic retention and share 
groups. There are a few options here from simple to more sophisticated. There 
are also topic-level and share-group level options.

The simple thing would be to ensure that the SPSO of each share group is 
bounded by the Log Start Offset (LSO) of each partition which itself is managed 
by the retention policy. This is a topic-level control which applies to all 
share-groups. I would say that this shared retention is the largest drawback of 
modeling queues on shared logs and this is worth noting.
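
In code terms, the simple option boils down to a one-line invariant
(illustrative only, not an interface from the KIP):

// Illustrative sketch: a share group's start offset (SPSO) is clamped so it
// never falls behind the partition's log start offset (LSO), which the
// retention policy advances.
static long boundedSpso(final long spso, final long logStartOffset) {
    return Math.max(spso, logStartOffset);
}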

More sophisticated approaches can be to allow the LSO to advance not (only) by 
retention policy but by the advancement of the lowest SPSO. This can keep the 
amount of data lower by garbage collecting messages that have been acknowledged 
by all share groups. Some people may like that behaviour on those topics where 
share groups are the only consumption model and no replay is needed.

There are per-share-group possibilities such as share-group TTLs where messages 
can be archived on a per share group basis.

Thanks
Jack


Re: [PR] MINOR: document how we deal with advisories for dependencies [kafka-site]

2023-10-04 Thread via GitHub


mimaison commented on code in PR #554:
URL: https://github.com/apache/kafka-site/pull/554#discussion_r1345492786


##
project-security.html:
##
@@ -35,6 +35,22 @@ Kafka security

 For a list of security issues fixed in released 
versions of Apache Kafka, see CVE list.

+   Advisories for dependencies
+   
+   Many organizations use 'security scanning' tools to 
detect components for which advisories exist. While we generally encourage 
using such tools, since they are an important way users are notified of risks, 
our experience is that they produce a lot of false positives: when a dependency 
of Kafka contains a vulnerability, it is likely Kafka is using it in a way that 
is not affected. As such, we do not consider the fact that an advisory has been 
published for a Kafka dependency sensitive. Only when additional analysis 
confirms Kafka is affected by the problem, we ask you to report this finding 
privately through mailto:secur...@kafka.apache.org?Subject=[SECURITY] 
My security issue" target="_top">secur...@kafka.apache.org.

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Lucas Brutschy
Hi Hanyu,

Thanks a lot for the KIP! I agree with Bruno's comments about fluent
API / keeping the query classes immutable. Apart from that, on the
technical side this is looking good already!

Two comments on the KIP itself (not the technical part):
 - The motivation section could be extended by a short paragraph why
we want to have `reverseRange` / `reverseAll` in the first place.
 - Consider using something like languagetool.org to avoid spelling /
grammar mistakes, which helps readability.

Cheers,
Lucas

On Wed, Oct 4, 2023 at 10:24 AM Bruno Cadonna  wrote:
>
> Hi Hanyu,
>
> I agree with what others said about having a `withDescendingOrder()`
> method and about documenting how the results are ordered.
>
> I would not add a reverse flag and a parameter to each method in
> RangeQuery. That makes the API less fluent and harder to maintain, since
> the flag would change all methods. There is no constraint to only add
> static factory methods to RangeQuery. In fact, if you look into the
> existing class KeyQuery, more precisely at skipCache(), and into the
> proposals for queries of versioned state stores, i.e., KIP-969, KIP-968,
> and KIP-960, we already have examples where we set a flag with an
> instance method, for example, asOf(). Such methods make the API more
> fluent and limit the blast radius of the flag to only one public method
> (plus the getter).
>
> So, making a query that reads the state store in reversed order would
> then result in:
>
> final RangeQuery<Integer, Integer> query = RangeQuery.withRange(1,
> 1000).withDescendingKeys();
>
> I think this is more readable than:
>
> final RangeQuery<Integer, Integer> query = RangeQuery.withRange(1, 1000,
> true);
>
> Additionally, I think the KIP would benefit from a usage example of the
> newly introduced methods like in KIP-969 etc.
>
> In my opinion, the test plan should also mention that you plan to
> write/adapt unit tests.
>
> Best,
> Bruno
>
> On 10/4/23 5:16 AM, Hanyu (Peter) Zheng wrote:
> > If we use WithDescendingKeys() to generate a RangeQuery to do the
> > reverseQuery, how do we combine it with methods like withRange, withUpperBound,
> > and withLowerBound?
> >
> > On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
> > wrote:
> >
> >> I believe there's no need to introduce a method like WithDescendingKeys().
> >> Instead, we can simply add a reverse flag to RangeQuery. Each method within
> >> RangeQuery would then accept an additional parameter. If reverse is set
> >> to true, it indicates the results should be reversed.
> >>
> >> Initially, I introduced a reverse variable. When set to false, the
> >> RangeQuery class behaves normally. However, when reverse is set to true,
> >> the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
> >> Further details can be found in the "Rejected Alternatives" section.
> >>
> >> In my perspective, RangeQuery is a class responsible for creating a series
> >> of RangeQuery objects. It offers methods such as withRange, withUpperBound,
> >> and withLowerBound, allowing us to generate objects representing different
> >> queries. I'm unsure how adding a withDescendingOrder() method would be
> >> compatible with the other methods, especially considering that, based on
> >> KIP 969, WithDescendingKeys() doesn't appear to take any input variables.
> >> And if withDescendingOrder() doesn't accept any input, how does it return a
> >> RangeQuery?
> >>
> >> On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
> >> wrote:
> >>
> >>> Hi, Colt,
> >>> The underlying structure of inMemoryKeyValueStore is treeMap.
> >>> Sincerely,
> >>> Hanyu
> >>>
> >>> On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
> >>> wrote:
> >>>
>  Hi Bill,
>  1. I will update the KIP in accordance with the PR and synchronize their
>  future updates.
>  2. I will use that name.
>  3. you mean add something about ordering at the motivation section?
> 
>  Sincerely,
>  Hanyu
> 
> 
>  On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
>  wrote:
> 
> > Hi, Walker,
> >
> > 1. I will update the KIP in accordance with the PR and synchronize
> > their future updates.
> > 2. I will use that name.
> > 3. I'll provide additional details in that section.
> > 4. I intend to utilize rangeQuery to achieve what we're referring to as
> > reverseQuery. In essence, reverseQuery is merely a term. To clear up any
> > ambiguity, I'll make necessary adjustments to the KIP.
> >
> > Sincerely,
> > Hanyu
> >
> >
> >
> > On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
> > wrote:
> >
> >> Ok, I will change it back to following the code, and update them
> >> together.
> >>
> >> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
> >>  wrote:
> >>
> >>> Hello Hanyu,
> >>>
> >>> Looking over your KIP, things mostly make sense, but I have a couple of
> >>> comments.
> >>>
> >>>
> >>>

[jira] [Resolved] (KAFKA-15164) Extract reusable logic from OffsetsForLeaderEpochClient

2023-10-04 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15164.

Resolution: Fixed

Included in PR https://github.com/apache/kafka/pull/14346

> Extract reusable logic from OffsetsForLeaderEpochClient
> ---
>
> Key: KAFKA-15164
> URL: https://issues.apache.org/jira/browse/KAFKA-15164
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
>
> The OffsetsForLeaderEpochClient class is used for making asynchronous 
> requests to the OffsetsForLeaderEpoch API. It encapsulates the logic for:
>  * preparing the requests
>  * sending them over the network using the network client
>  * handling the response
> The new KafkaConsumer implementation, based on a new threading model, 
> requires the same logic for preparing the requests and handling the 
> responses, with different behaviour for how the request is actually sent.
> This task includes refactoring OffsetsForLeaderEpochClient by extracting out 
> the logic for preparing the requests and handling the responses. No changes 
> in the existing logic, just making the functionality available to be reused.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Jun Rao
Hi, Calvin,

Thanks for the reply. Just one more comment.

54. It seems that downgrading MV is supported. Is downgrading the software
version supported? It would be useful to document that.

Thanks,

Jun

On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
 wrote:

> Hi Colin,
>
> I think in your example "do_unclean_recovery" would need to do different
> things depending on the strategy.
>
> do_unclean_recovery() {
>if (unclean.recovery.manager.enabled) {
> if (strategy == Aggressive)
>   use UncleanRecoveryManager(waitLastKnownERL=false)  // just inspect
> logs from whoever is available
> else
>   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait for
> at least last known ELR
>   } else {
> if (strategy == Aggressive)
>   choose the last known leader if that is available, or a random leader
> if not)
> else
>   wait for last known leader to get back
>   }
> }
>
> The idea is that the Aggressive strategy would kick in as soon as we lost
> the leader and would pick a leader from whoever is available; but the
> Balanced will only kick in when ELR is empty and will wait for the brokers
> that likely have most data to be available.
>
> On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe  wrote:
>
> > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > Hi, Calvin,
> > >
> > > Thanks for the update KIP. A few more comments.
> > >
> > > 41. Why would a user choose the option to select a random replica as
> the
> > > leader instead of using unclean.recovery.strateg=Aggressive? It seems
> > that
> > > the latter is strictly better? If that's not the case, could we fold
> this
> > > option under unclean.recovery.strategy instead of introducing a
> separate
> > > config?
> >
> > Hi Jun,
> >
> > I thought the flow of control was:
> >
> > If there is no leader for the partition {
> >   If (there are unfenced ELR members) {
> > choose_an_unfenced_ELR_member
> >   } else if (there are fenced ELR members AND strategy=Aggressive) {
> > do_unclean_recovery
> >   } else if (there are no ELR members AND strategy != None) {
> > do_unclean_recovery
> >   } else {
> > do nothing about the missing leader
> >   }
> > }
> >
> > do_unclean_recovery() {
> >if (unclean.recovery.manager.enabled) {
> > use UncleanRecoveryManager
> >   } else {
> > choose the last known leader if that is available, or a random leader
> > if not)
> >   }
> > }
> >
> > However, I think this could be clarified, especially the behavior when
> > unclean.recovery.manager.enabled=false. Intuitively the goal for
> > unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
> > but it's very underspecified in the KIP, I agree.
> >
> > >
> > > 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> > > first 20 will be served. Others will be returned with DesiredLeaders."
> > Hmm,
> > > not sure that I understand this. ElectLeadersResponse doesn't have a
> > > DesiredLeaders field.
> > >
> > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only
> > the
> > > first 2000 will be served" Do we return an error for the remaining
> > > partitions? Actually, should we include an errorCode field at the
> > partition
> > > level in GetReplicaLogInfoResponse to cover non-existing partitions and
> > no
> > > authorization, etc?
> > >
> > > 52. The entry should matches => The entry should match
> > >
> > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> > user
> > > may not specify DesiredLeaders?
> > >
> > > 54. Downgrade: Is that indeed possible? I thought earlier you said that
> > > once the new version of the records are in the metadata log, one can't
> > > downgrade since the old broker doesn't know how to parse the new
> version
> > of
> > > the metadata records?
> > >
> >
> > MetadataVersion downgrade is currently broken, but fixing it is on our
> > plate for Kafka 3.7.
> >
> > The way downgrade works is that "new features" are dropped, leaving only
> > the old ones.
> >
> > > 55. CleanShutdownFile: Should we add a version field for future
> > extension?
> > >
> > > 56. Config changes are public facing. Could we have a separate section
> to
> > > document all the config changes?
> >
> > +1. A separate section for this would be good.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu  >
> > > wrote:
> > >
> > >> Hi Jun
> > >> Thanks for the comments.
> > >>
> > >> 40. If we change to None, it does not guarantee no data loss. For
> > users
> > >> who are not able to validate the data with external resources, manual
> > >> intervention does not give a better result but a loss of availability.
> > So
> > >> practically speaking, the Balance mode would be a better default
> value.
> > >>
> > >> 41. No, it represents how we want to do the unclean leader election.
> If
> > it
> > >> is false, the unclean leader election will be the old random way.
> > >> 

Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-04 Thread Greg Harris
Hey Viktor,

Thanks for thinking about Tiered Storage. I'm not so familiar with it,
so if you could add some of your expectations about how the two
features will interact, I would appreciate that.

It appears to me that follower-fetch-from-remote is a significant
optimization within TS, and so similar optimizations to support
cross-cluster-replicate-from-remote and out-of-band remote replication
could also be desirable.
I think we can explore the idea further, and make sure that CCR is
extensible to tiered topics if it doesn't make it into the initial
implementation.

Thanks!
Greg

On Wed, Oct 4, 2023 at 6:13 AM Viktor Somogyi-Vass
 wrote:
>
> Hi Greg,
>
> Thanks for the answers. I think they all make sense.
>
> Another point I realized last evening is that now that tiered storage (TS)
> is available, it might complicate things with CCR. What I'm thinking of is
> that if you have multiple clusters in multiple regions, enabling the object
> storage's replication between zones could be much more cost-efficient than
> replicating local+remote offsets through Kafka. You'd only need to copy
> local segments over and remote partition replication would be done by the
> remote layer. Or the user could simply choose to not replicate remote
> segments between regions but instead just reference them (so that the
> backup cluster's remote offsets point to the original region). These
> options, however, likely require closer coordination between clusters than in
> pre-TS Kafka. Do you think we should take this into consideration in the
> design and in the UX?
>
> Thanks,
> Viktor
>
> On Tue, Oct 3, 2023 at 6:30 PM Greg Harris 
> wrote:
>
> > Hi Viktor,
> >
> > Thanks for your questions! I agree, replication is very fundamental in
> > Kafka, so it's been implemented in many different ways by different
> > people. I hope that this is the last implementation we'll need, but
> > every software engineer says that :)
> >
> > GT-1: I think that, as this KIP is very focused on the UX of the feature,
> > user stories are appropriate to include. I think it isn't
> > necessary to explain how the different applications are accomplished
> > with MM2 or other solutions, but describing what they will look like
> > after this KIP would be a wonderful addition. +1
> >
> > MM2-1: I think that replacing the consumer is insufficient, as we need
> > a more expressive producer as well. This is not possible within the
> > design constraints of MM2 as a Connector, as MM2 uses the
> > connect-managed producer. This could be implemented in MM3 as a new
> > process that can use more expressive "internal clients", but then
> > we've thrown away the Connect runtime that made MM2 easier to run for
> > some users.
> > MM2-2: This is technically possible, but sounds operationally hazardous to
> > me.
> > MM2-3: From the user perspective, I believe that CCR can be made simpler
> > to use and operate than MM2, while providing better guarantees.
> > From the implementation standpoint, I think that CCR will be
> > significantly more complex, as the architecture of MM2 leverages a lot
> > of the Connect infrastructure.
> >
> > LaK-1: Yes, I think you understand what I was going for.
> > LaK-2: I don't think that this is a user experience that we could add
> > to CCR without changing the Kafka clients to be aware of both clusters
> > concurrently. In order to redirect clients away from a failed cluster
> > with a metadata refresh, the cluster that they're currently connected
> > to must give them that data. But because the cluster failed, that
> > refresh will not be reliable. With a proxy between the client and
> > Kafka, that proxy can be available while the original Kafka cluster is
> > not. Failovers would happen between distinct sets of clients that are
> > part of the same logical application.
> >
> > Thanks for taking a look at the rejected alternatives!
> > Greg
> >
> > On Tue, Oct 3, 2023 at 3:24 AM Viktor Somogyi-Vass
> >  wrote:
> > >
> > > Hi Greg,
> > >
> > > Seems like finding the perfect replication solution is a never ending
> > story
> > > for Kafka :).
> > >
> > > Some general thoughts:
> > > GT-1. While as you say it would be good to have some kind of built-in
> > > replication in Kafka, we definitely need to understand the problem better
> > > to provide a better solution. Replication has lots of user stories, as you
> > > iterated over a few, and I think it's very well worth the time to detail
> > > each one in the KIP. This may help others who may want to contribute
> > > understand the problem on a deeper level, somewhat sets the scope, and
> > > describes the problem in a way that a good solution can be deduced from
> > it.
> > >
> > > I also have a few questions regarding some of the rejected solutions:
> > >
> > > MM2:
> > > I think your points about MM2 are fair (offset transparency and
> > operational
> > > complexity), however I think it needs more reasoning about why are we
> > > moving in a different direction?
> > > A few 

[jira] [Created] (KAFKA-15541) RocksDB Iterator Metrics

2023-10-04 Thread Nicholas Telford (Jira)
Nicholas Telford created KAFKA-15541:


 Summary: RocksDB Iterator Metrics
 Key: KAFKA-15541
 URL: https://issues.apache.org/jira/browse/KAFKA-15541
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Nicholas Telford
Assignee: Nicholas Telford


[KIP-989: RocksDB Iterator 
Metrics|https://cwiki.apache.org/confluence/display/KAFKA/KIP-989%3A+RocksDB+Iterator+Metrics]

RocksDB {{Iterators}} must be closed after use, to prevent memory leaks due to 
[blocks being "pinned" 
in-memory|https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB#blocks-pinned-by-iterators].
 Pinned blocks can currently be tracked via the per-store 
{{block-cache-pinned-usage}} metric. However, it's common [(and even 
recommended)|https://docs.confluent.io/platform/current/streams/developer-guide/memory-mgmt.html#rocksdb]
 to share the Block Cache among all stores in an application, to enable users 
to globally bound native memory used by RocksDB. This results in the 
{{block-cache-pinned-usage}} reporting the same memory usage for every store in 
the application, irrespective of which store is actually pinning blocks in the 
block cache.

To aid users in finding leaked Iterators, as well as identifying the cause of a 
high number of pinned blocks, we introduce two new metrics.
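
For context, a minimal Java sketch of the leak-prone pattern this ticket
targets, using the public Kafka Streams interactive-query API; the store
argument is assumed to be obtained elsewhere:

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    void dumpStore(ReadOnlyKeyValueStore<String, Long> store) {
        // KeyValueIterator wraps a RocksDB iterator; closing it unpins the
        // blocks it holds in the (possibly shared) block cache. Forgetting
        // close() is exactly the "Iterator leak" the new metrics target.
        try (KeyValueIterator<String, Long> iter = store.all()) {
            while (iter.hasNext()) {
                KeyValue<String, Long> entry = iter.next();
                System.out.println(entry.key + " -> " + entry.value);
            }
        }
    }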



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15543) Send HB request right after reconciliation completes

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15543:
--

 Summary: Send HB request right after reconciliation completes
 Key: KAFKA-15543
 URL: https://issues.apache.org/jira/browse/KAFKA-15543
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


The HeartbeatRequest manager should send an HB request outside of the regular
interval, right after the reconciliation process completes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15544) Enable existing client integration tests for new protocol

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15544:
--

 Summary: Enable existing client integration tests for new protocol 
 Key: KAFKA-15544
 URL: https://issues.apache.org/jira/browse/KAFKA-15544
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


Enable & validate integration tests defined in `PlaintextAsyncConsumerTest`
that are currently disabled, waiting for the client to fully support the new
consumer group protocol.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2255

2023-10-04 Thread Apache Jenkins Server
See 




Re: [PR] KAFKA-15483 Add KIP-938 and KIP-866 metrics to ops.html [kafka-site]

2023-10-04 Thread via GitHub


mumrah merged PR #548:
URL: https://github.com/apache/kafka-site/pull/548


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (KAFKA-15538) Resolve regex on client side when using java regex in new consumer

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15538:
--

 Summary: Resolve regex on client side when using java regex in new 
consumer 
 Key: KAFKA-15538
 URL: https://issues.apache.org/jira/browse/KAFKA-15538
 Project: Kafka
  Issue Type: Task
  Components: clients, consumer
Reporter: Lianet Magrans


We need to resolve a java Pattern regex on the client side to send the broker a 
list of topic names to subscribe to.

Context:

The new consumer group protocol uses [Google 
RE2/J|https://github.com/google/re2j] for regular expressions and introduces 
new methods in the consumer API to subscribe using a `SubscriptionPattern`. The
subscribe using a java `Pattern` will still be supported for a while but
eventually removed.
 * When the subscribe with SubscriptionPattern is used, the client should just 
send the regex to the broker and it will be resolved on the server side.
 * In the case of the subscribe with Pattern, the regex should be resolved on 
the client side.
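
A hedged sketch of the two subscription paths; `consumer` and `listener`
are assumed to be in scope, and `SubscriptionPattern` is the class proposed
by the new protocol work, so its exact name and signature may still change:

    import java.util.regex.Pattern;

    // Existing API: under this ticket the client itself resolves the
    // java.util.regex.Pattern against fetched topic metadata and sends the
    // broker the matching topic names.
    consumer.subscribe(Pattern.compile("orders-.*"), listener);

    // Proposed API: an RE2/J-based pattern that is sent to the broker
    // verbatim and resolved on the server side.
    consumer.subscribe(new SubscriptionPattern("orders-.*"), listener);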



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15438) Review exception caching logic used for reset/validate positions in async consumer

2023-10-04 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15438.

Resolution: Fixed

> Review exception caching logic used for reset/validate positions in async 
> consumer
> --
>
> Key: KAFKA-15438
> URL: https://issues.apache.org/jira/browse/KAFKA-15438
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Minor
>  Labels: consumer-threading-refactor
>
> The refactored async consumer reuses part of the core logic required for 
> resetting and validating positions. That currently works on the principle of 
> async requests, that reset/validate positions when responses are received. If 
> the responses include errors, or if a validation verification fails (ex. log 
> truncation detected), exceptions are saved in-memory, to be thrown on the 
> next call to the reset/validate. Note that these functionalities are 
> periodically called as part of the poll loop to update fetch positions before 
> fetching records.
>  
> As an initial implementation, the async consumer reuses this same caching
> logic, as it has the async nature required. Keeping this caching logic ensures
> that we maintain the timing of the exceptions thrown for reset/validate
> (they are currently not thrown when discovered; instead they are thrown on
> the next call to reset/validate). This task aims at reviewing the
> implications of changing this behaviour, and relying on the completion of the
> Reset and Validate events instead, to propagate the errors found. Note that
> this would happen closely intertwined with the continued poll loop, which may
> have already issued a new reset/validate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Unsubscribe :

2023-10-04 Thread Girish L
Dear Team

I have repeatedly sent email to dev-unsubscr...@kafka.apache.org to
unsubscribe this email address from the email notifications
received from dev@kafka.apache.org.
Could one of you please help me with the correct process?

Regards
Girish


[DISCUSS] KIP-989: RocksDB Iterator Metrics

2023-10-04 Thread Nick Telford
Hi everyone,

KIP-989 is a small Kafka Streams KIP to add a few new metrics around the
creation and use of RocksDB Iterators, to aid users in identifying
"Iterator leaks" that could cause applications to leak native memory.

Let me know what you think!

https://cwiki.apache.org/confluence/display/KAFKA/KIP-989%3A+RocksDB+Iterator+Metrics

P.S. I'm not too sure about the formatting of the "New Metrics" table, any
advice there would be appreciated.

Regards,
Nick


[jira] [Created] (KAFKA-15539) Stop fetching while partitions being revoked

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15539:
--

 Summary: Stop fetching while partitions being revoked
 Key: KAFKA-15539
 URL: https://issues.apache.org/jira/browse/KAFKA-15539
 Project: Kafka
  Issue Type: Task
  Components: clients, consumer
Reporter: Lianet Magrans


When partitions are being revoked (client received revocation on heartbeat and 
is in the process of invoking the callback), we need to make sure we do not 
fetch from those partitions anymore:
 * no new fetches should be sent out for the partitions being revoked
 * no fetch responses should be handled for those partitions (case where a
fetch was already in-flight when the partition revocation started).

This does not seem to be handled in the current KafkaConsumer and the old 
consumer protocol (only for the EAGER protocol). 

Consider re-using the existing pendingRevocation logic that already exists in
the subscriptionState and is used by the fetcher to determine whether a
partition is fetchable.
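
A hypothetical sketch of that check; SubscriptionState and the
pendingRevocation idea mirror the internals mentioned above, but the exact
method names here are assumptions:

    boolean isFetchable(TopicPartition tp, SubscriptionState subscriptions) {
        // A partition is fetchable only if it is still assigned, is not in
        // the middle of being revoked, and has a valid fetch position.
        return subscriptions.isAssigned(tp)
                && !subscriptions.isPendingRevocation(tp)
                && subscriptions.hasValidPosition(tp);
    }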



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15540) Handle heartbeat and assignment release when consumer leaves group

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15540:
--

 Summary: Handle heartbeat and assignment release when consumer 
leaves group
 Key: KAFKA-15540
 URL: https://issues.apache.org/jira/browse/KAFKA-15540
 Project: Kafka
  Issue Type: Task
  Components: clients, consumer
Reporter: Lianet Magrans


When a consumer intentionally leaves a group we should:
 * release assignment (revoke partitions)
 * send a last Heartbeat request with epoch -1 (or -2 if static member)

Note that the revocation involves stopping fetching, committing offsets if
auto-commit is enabled, and invoking the onPartitionsRevoked callback.
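
Illustrative pseudocode of the leave sequence in Java form; field and
method names here are assumptions rather than actual client internals:

    void leaveGroup(boolean staticMember) {
        Set<TopicPartition> revoked = subscriptions.assignedPartitions();
        stopFetching(revoked);              // no new fetches for these partitions
        if (autoCommitEnabled) {
            commitOffsetsSync();            // flush offsets before leaving
        }
        listener.onPartitionsRevoked(revoked);
        int leaveEpoch = staticMember ? -2 : -1; // -2 only for static members
        sendLastHeartbeat(leaveEpoch);      // final HB announces the departure
    }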

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15542) Release member assignments on errors

2023-10-04 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15542:
--

 Summary: Release member assignments on errors
 Key: KAFKA-15542
 URL: https://issues.apache.org/jira/browse/KAFKA-15542
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


The member should release its assignment by triggering the onPartitionsLost
flow from the HB manager when errors occur (both fencing and unrecoverable
errors).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Unsubscribe :

2023-10-04 Thread Justine Olshan
Hey Girish,

You may need to confirm the unsubscription with a second email.

When I was switching subscription emails, I sent one to the unsubscribe
email and then I got a reply.
In the reply it asked me to send to a unique email address to confirm. Look
for one from dev-h...@kafka.apache.org.

It should have directions on how to unsubscribe. Let me know if you do not
get this second email to confirm the unsubscription.

Justine


On Wed, Oct 4, 2023 at 8:03 AM Girish L  wrote:

> Dear Team
>
> I have repeatedly sent email to dev-unsubscr...@kafka.apache.org to
> unsubscribe this email address from the email notifications
> received from dev@kafka.apache.org.
> Could one of you please help me with the correct process?
>
> Regards
> Girish
>


Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Hanyu (Peter) Zheng
Hi, Lucas,

Thank you for your suggestions.
I will update the KIP and code together.

Sincerely,
Hanyu

On Tue, Oct 3, 2023 at 8:16 PM Hanyu (Peter) Zheng 
wrote:

> If we use WithDescendingKeys() to generate a RangeQuery to do the
> reverseQuery, how do we achieve methods like withRange, withUpperBound,
> and withLowerBound with only this method?
>
> On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
> wrote:
>
>> I believe there's no need to introduce a method like
>> WithDescendingKeys(). Instead, we can simply add a reverse flag to
>> RangeQuery. Each method within RangeQuery would then accept an additional
>> parameter. If the reverse is set to true, it would indicate the results
>> should be reversed.
>>
>> Initially, I introduced a reverse variable. When set to false, the
>> RangeQuery class behaves normally. However, when reverse is set to true,
>> the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
>> Further details can be found in the "Rejected Alternatives" section.
>>
>> From my perspective, RangeQuery is a class responsible for creating a
>> series of RangeQuery objects. It offers methods such as withRange,
>> withUpperBound, and withLowerBound, allowing us to generate objects
>> representing different queries. I'm unsure how adding a
>> withDescendingOrder() method would be compatible with the other methods,
>> especially considering that, based on KIP 969, WithDescendingKeys() doesn't
>> appear to take any input variables. And if withDescendingOrder() doesn't
>> accept any input, how does it return a RangeQuery?
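
For what it's worth, one way an argument-less withDescendingKeys() can
compose with the static factories is as an instance method that returns a
copy of the query with an order flag flipped; a hypothetical sketch, not
the final API:

    import java.util.Optional;

    public final class RangeQuery<K, V> {
        private final Optional<K> lower;
        private final Optional<K> upper;
        private final boolean ascending;

        private RangeQuery(Optional<K> lower, Optional<K> upper, boolean ascending) {
            this.lower = lower;
            this.upper = upper;
            this.ascending = ascending;
        }

        public static <K, V> RangeQuery<K, V> withRange(K lower, K upper) {
            return new RangeQuery<>(Optional.ofNullable(lower),
                                    Optional.ofNullable(upper), true);
        }

        // No input needed: it derives a new query from an existing one.
        public RangeQuery<K, V> withDescendingKeys() {
            return new RangeQuery<>(lower, upper, false);
        }
    }

    // Usage: RangeQuery.<Integer, String>withRange(1, 3).withDescendingKeys();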
>>
>> On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> Hi, Colt,
>>> The underlying structure of InMemoryKeyValueStore is a TreeMap.
>>> Sincerely,
>>> Hanyu
>>>
>>> On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
>>> wrote:
>>>
 Hi Bill,
 1. I will update the KIP in accordance with the PR and synchronize
 their future updates.
 2. I will use that name.
 3. you mean add something about ordering at the motivation section?

 Sincerely,
 Hanyu


 On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
 wrote:

> Hi, Walker,
>
> 1. I will update the KIP in accordance with the PR and synchronize
> their future updates.
> 2. I will use that name.
> 3. I'll provide additional details in that section.
> 4. I intend to utilize rangeQuery to achieve what we're referring to
> as reverseQuery. In essence, reverseQuery is merely a term. To clear up 
> any
> ambiguity, I'll make necessary adjustments to the KIP.
>
> Sincerely,
> Hanyu
>
>
>
> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng <
> pzh...@confluent.io> wrote:
>
>> Ok, I will change it back to following the code, and update them
>> together.
>>
>> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>>  wrote:
>>
>>> Hello Hanyu,
>>>
>>> Looking over your kip things mostly make sense but I have a couple of
>>> comments.
>>>
>>>
>>>1. You have "withDescandingOrder()". I think you mean
>>> "descending" :)
>>>Also there are still a few places in the doc where it's called
>>> "setReverse"
>>>2. Also I like "WithDescendingKeys()" better
>>>3. I'm not sure of what ordering guarantees we are offering.
>>> Perhaps we
>>>can add a section to the motivation clearly spelling out the
>>> current
>>>ordering and the new offering?
>>>4. When you say "use unbounded reverseQuery to achieve
>>> reverseAll" do
>>>you mean "use unbounded RangeQuery to achieve reverseAll"? as far
>>> as I can
>>>tell we don't have a reverseQuery as a named object?
>>>
>>>
>>> Looking good so far
>>>
>>> best,
>>> Walker
>>>
>>> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
>>> wrote:
>>>
>>> > Hello Hanyu,
>>> >
>>> > Thank you for the KIP. I agree with Matthias' proposal to keep the
>>> naming
>>> > convention consistent with KIP-969. I favor the
>>> `.withDescendingKeys()`
>>> > name.
>>> >
>>> > I am curious about one thing. RocksDB guarantees that records
>>> returned
>>> > during a range scan are lexicographically ordered by the bytes of
>>> the keys
>>> > (either ascending or descending order, as specified in the query).
>>> This
>>> > means that results within a single partition are indeed ordered.**
>>> My
>>> > reading of KIP-805 suggests to me that you don't need to specify
>>> the
>>> > partition number you are querying in IQv2, which means that you
>>> can have a
>>> > valid reversed RangeQuery over a store with "multiple partitions"
>>> in it.
>>> >
>>> > Currently, IQv1 does not guarantee order of keys in this scenario.
>>> Does
>>> > IQv2 support ordering across partitions? Such an implementation
>>> would

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Hanyu (Peter) Zheng
Hi,  Bruno

Thank you for your suggestions, I will update them soon.
Sincerely,

Hanyu

On Wed, Oct 4, 2023 at 9:25 AM Hanyu (Peter) Zheng 
wrote:

> Hi, Lucas,
>
> Thank you for your suggestions.
> I will update the KIP and code together.
>
> Sincerely,
> Hanyu
>
> On Tue, Oct 3, 2023 at 8:16 PM Hanyu (Peter) Zheng 
> wrote:
>
>> If we use WithDescendingKeys() to generate a RangeQuery to do the
>> reverseQuery, how do we achieve methods like withRange, withUpperBound,
>> and withLowerBound with only this method?
>>
>> On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> I believe there's no need to introduce a method like
>>> WithDescendingKeys(). Instead, we can simply add a reverse flag to
>>> RangeQuery. Each method within RangeQuery would then accept an additional
>>> parameter. If the reverse is set to true, it would indicate the results
>>> should be reversed.
>>>
>>> Initially, I introduced a reverse variable. When set to false, the
>>> RangeQuery class behaves normally. However, when reverse is set to true,
>>> the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
>>> Further details can be found in the "Rejected Alternatives" section.
>>>
>>> From my perspective, RangeQuery is a class responsible for creating a
>>> series of RangeQuery objects. It offers methods such as withRange,
>>> withUpperBound, and withLowerBound, allowing us to generate objects
>>> representing different queries. I'm unsure how adding a
>>> withDescendingOrder() method would be compatible with the other methods,
>>> especially considering that, based on KIP 969, WithDescendingKeys() doesn't
>>> appear to take any input variables. And if withDescendingOrder() doesn't
>>> accept any input, how does it return a RangeQuery?
>>>
>>> On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
>>> wrote:
>>>
 Hi, Colt,
 The underlying structure of InMemoryKeyValueStore is a TreeMap.
 Sincerely,
 Hanyu

 On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
 wrote:

> Hi Bill,
> 1. I will update the KIP in accordance with the PR and synchronize
> their future updates.
> 2. I will use that name.
> 3. you mean add something about ordering at the motivation section?
>
> Sincerely,
> Hanyu
>
>
> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng <
> pzh...@confluent.io> wrote:
>
>> Hi, Walker,
>>
>> 1. I will update the KIP in accordance with the PR and synchronize
>> their future updates.
>> 2. I will use that name.
>> 3. I'll provide additional details in that section.
>> 4. I intend to utilize rangeQuery to achieve what we're referring to
>> as reverseQuery. In essence, reverseQuery is merely a term. To clear up 
>> any
>> ambiguity, I'll make necessary adjustments to the KIP.
>>
>> Sincerely,
>> Hanyu
>>
>>
>>
>> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng <
>> pzh...@confluent.io> wrote:
>>
>>> Ok, I will change it back to following the code, and update them
>>> together.
>>>
>>> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>>>  wrote:
>>>
 Hello Hanyu,

 Looking over your kip things mostly make sense but I have a couple
 of
 comments.


1. You have "withDescandingOrder()". I think you mean
 "descending" :)
Also there are still a few places in the doc where it's called
 "setReverse"
2. Also I like "WithDescendingKeys()" better
3. I'm not sure of what ordering guarantees we are offering.
 Perhaps we
can add a section to the motivation clearly spelling out the
 current
ordering and the new offering?
4. When you say "use unbounded reverseQuery to achieve
 reverseAll" do
you mean "use unbounded RangeQuery to achieve reverseAll"? as
 far as I can
tell we don't have a reverseQuery as a named object?


 Looking good so far

 best,
 Walker

 On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
 wrote:

 > Hello Hanyu,
 >
 > Thank you for the KIP. I agree with Matthias' proposal to keep
 the naming
 > convention consistent with KIP-969. I favor the
 `.withDescendingKeys()`
 > name.
 >
 > I am curious about one thing. RocksDB guarantees that records
 returned
 > during a range scan are lexicographically ordered by the bytes of
 the keys
 > (either ascending or descending order, as specified in the
 query). This
 > means that results within a single partition are indeed
 ordered.** My
 > reading of KIP-805 suggests to me that you don't need to specify
 the
 > partition number you are querying in 

[jira] [Created] (KAFKA-15545) Update Request metrics in ops.html to reflect all the APIs

2023-10-04 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-15545:
--

 Summary: Update Request metrics in ops.html to reflect all the APIs
 Key: KAFKA-15545
 URL: https://issues.apache.org/jira/browse/KAFKA-15545
 Project: Kafka
  Issue Type: Task
Reporter: Justine Olshan


When updating for KAFKA-15530, I noticed that the request metrics only mention 
Produce|FetchConsumer|FetchFollower. These request metrics apply to all APIs,
so we should update the documentation to make this clearer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Calvin Liu
Hi Jun,
54. Marked that software downgrading is not supported, as the old controller
will not understand the new PartitionRecord and PartitionChangeRecord.
Thanks!

On Wed, Oct 4, 2023 at 9:12 AM Jun Rao  wrote:

> Hi, Calvin,
>
> Thanks for the reply. Just one more comment.
>
> 54. It seems that downgrading MV is supported. Is downgrading the software
> version supported? It would be useful to document that.
>
> Thanks,
>
> Jun
>
> On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
>  wrote:
>
> > Hi Colin,
> >
> > I think in your example "do_unclean_recovery" would need to do different
> > things depending on the strategy.
> >
> > do_unclean_recovery() {
> >if (unclean.recovery.manager.enabled) {
> > if (strategy == Aggressive)
> >   use UncleanRecoveryManager(waitLastKnownERL=false)  // just inspect
> > logs from whoever is available
> > else
> >   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait
> for
> > at least last known ELR
> >   } else {
> > if (strategy == Aggressive)
> >   choose the last known leader if that is available, or a random
> leader
> > if not)
> > else
> >   wait for last known leader to get back
> >   }
> > }
> >
> > The idea is that the Aggressive strategy would kick in as soon as we lost
> > the leader and would pick a leader from whoever is available; but the
> > Balanced will only kick in when ELR is empty and will wait for the
> brokers
> > that likely have most data to be available.
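
A companion sketch of this strategy-dependent variant, again purely
illustrative and reusing the hypothetical LeaderElectionSketch types from
the sketch earlier in this digest; the waitLastKnownElr flag is an
assumption about the proposed UncleanRecoveryManager, not a real API:

    import java.util.OptionalInt;

    final class StrategyAwareRecoverySketch {
        interface RecoveryManager {
            // waitLastKnownElr=false: inspect logs of whichever replicas
            // respond now; waitLastKnownElr=true: wait until at least the
            // last known ELR members are back before deciding.
            OptionalInt recover(LeaderElectionSketch.PartitionState p,
                                boolean waitLastKnownElr);
        }

        static OptionalInt doUncleanRecovery(
                LeaderElectionSketch.PartitionState p,
                LeaderElectionSketch.UncleanRecoveryStrategy strategy,
                boolean managerEnabled,
                RecoveryManager manager) {
            boolean aggressive = strategy
                    == LeaderElectionSketch.UncleanRecoveryStrategy.AGGRESSIVE;
            if (managerEnabled) {
                return manager.recover(p, !aggressive);
            }
            if (aggressive) {
                // Last known leader if available, otherwise any live replica.
                OptionalInt last = p.lastKnownLeader();
                return last.isPresent() ? last : p.randomLiveReplica();
            }
            return OptionalInt.empty(); // Balanced: wait for last known leader
        }
    }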
> >
> > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe  wrote:
> >
> > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > > Hi, Calvin,
> > > >
> > > > Thanks for the updated KIP. A few more comments.
> > > >
> > > > 41. Why would a user choose the option to select a random replica as
> > the
> > > > leader instead of using unclean.recovery.strategy=Aggressive? It seems
> > > that
> > > > the latter is strictly better? If that's not the case, could we fold
> > this
> > > > option under unclean.recovery.strategy instead of introducing a
> > separate
> > > > config?
> > >
> > > Hi Jun,
> > >
> > > I thought the flow of control was:
> > >
> > > If there is no leader for the partition {
> > >   If (there are unfenced ELR members) {
> > > choose_an_unfenced_ELR_member
> > >   } else if (there are fenced ELR members AND strategy=Aggressive) {
> > > do_unclean_recovery
> > >   } else if (there are no ELR members AND strategy != None) {
> > > do_unclean_recovery
> > >   } else {
> > > do nothing about the missing leader
> > >   }
> > > }
> > >
> > > do_unclean_recovery() {
> > >if (unclean.recovery.manager.enabled) {
> > > use UncleanRecoveryManager
> > >   } else {
> > > choose the last known leader if that is available, or a random
> leader
> > > if not)
> > >   }
> > > }
> > >
> > > However, I think this could be clarified, especially the behavior when
> > > unclean.recovery.manager.enabled=false. Intuitively the goal for
> > > unclean.recovery.manager.enabled=false is to be "the same as now,
> mostly"
> > > but it's very underspecified in the KIP, I agree.
> > >
> > > >
> > > > 50. ElectLeadersRequest: "If more than 20 topics are included, only
> the
> > > > first 20 will be served. Others will be returned with
> DesiredLeaders."
> > > Hmm,
> > > > not sure that I understand this. ElectLeadersResponse doesn't have a
> > > > DesiredLeaders field.
> > > >
> > > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included,
> only
> > > the
> > > > first 2000 will be served" Do we return an error for the remaining
> > > > partitions? Actually, should we include an errorCode field at the
> > > partition
> > > > level in GetReplicaLogInfoResponse to cover non-existing partitions
> and
> > > no
> > > > authorization, etc?
> > > >
> > > > 52. The entry should matches => The entry should match
> > > >
> > > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> > > user
> > > > may not specify DesiredLeaders?
> > > >
> > > > 54. Downgrade: Is that indeed possible? I thought earlier you said
> that
> > > > once the new version of the records are in the metadata log, one
> can't
> > > > downgrade since the old broker doesn't know how to parse the new
> > version
> > > of
> > > > the metadata records?
> > > >
> > >
> > > MetadataVersion downgrade is currently broken, but fixing it is on our
> > > plate for Kafka 3.7.
> > >
> > > The way downgrade works is that "new features" are dropped, leaving
> only
> > > the old ones.
> > >
> > > > 55. CleanShutdownFile: Should we add a version field for future
> > > extension?
> > > >
> > > > 56. Config changes are public facing. Could we have a separate
> section
> > to
> > > > document all the config changes?
> > >
> > > +1. A separate section for this would be good.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu
>  > >
> > > > wrote:
> > > >
> > > >> Hi Jun
> > > >> Thanks for 

[PR] KAFKA-15530: Add 3.6 metrics documentation for new transactions metrics [kafka-site]

2023-10-04 Thread via GitHub


jolshan opened a new pull request, #555:
URL: https://github.com/apache/kafka-site/pull/555

   Corresponding kafka-site PR for https://github.com/apache/kafka/pull/14480


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Hanyu (Peter) Zheng
For testing purposes, we previously used a Set to record the results in
IQv2StoreIntegrationTest. Let's take an example where we now have two
partitions and four key-value pairs: <0,0> in p0, <1,1> in p1, <2,2> in p0,
and <3,3> in p1.

If we execute withRange(1,3), it will return a Set of <1, 2, 3>. However,
if we run withRange(1,3).withDescendingKeys(), and still use a Set, the
result will again be a Set of <1,2,3>. This means we won't be able to
determine whether the results have been reversed.

To resolve this ambiguity, I've switched to using a List to record the
results, ensuring the order of retrieval from partitions p0 and p1. So,
withRange(1,3) would yield a List of [2, 1, 3], whereas
withRange(1,3).withDescendingKeys() would produce a List of [2,3,1].

This ordering makes sense since RocksDB sorts its keys, and InMemoryStore
uses a TreeMap structure, which means the keys are already sorted.
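
A toy Java illustration (not the actual test) of why the switch matters,
using the per-partition results described above:

    import java.util.HashSet;
    import java.util.List;

    List<Integer> ascending  = List.of(2, 1, 3); // p0=[2], then p1=[1, 3]
    List<Integer> descending = List.of(2, 3, 1); // p0=[2], then p1=[3, 1]

    // As Sets the two outcomes are indistinguishable...
    assert new HashSet<>(ascending).equals(new HashSet<>(descending));
    // ...while as Lists the order difference is visible.
    assert !ascending.equals(descending);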

Sincerely,
Hanyu

On Wed, Oct 4, 2023 at 9:25 AM Hanyu (Peter) Zheng 
wrote:

> Hi,  Bruno
>
> Thank you for your suggestions, I will update them soon.
> Sincerely,
>
> Hanyu
>
> On Wed, Oct 4, 2023 at 9:25 AM Hanyu (Peter) Zheng 
> wrote:
>
>> Hi, Lucas,
>>
>> Thank you for your suggestions.
>> I will update the KIP and code together.
>>
>> Sincerely,
>> Hanyu
>>
>> On Tue, Oct 3, 2023 at 8:16 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> If we use WithDescendingKeys() to generate a RangeQuery to do the
>>> reverseQuery, how do we achieve methods like withRange, withUpperBound,
>>> and withLowerBound with only this method?
>>>
>>> On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
>>> wrote:
>>>
 I believe there's no need to introduce a method like
 WithDescendingKeys(). Instead, we can simply add a reverse flag to
 RangeQuery. Each method within RangeQuery would then accept an additional
 parameter. If the reverse is set to true, it would indicate the results
 should be reversed.

 Initially, I introduced a reverse variable. When set to false, the
 RangeQuery class behaves normally. However, when reverse is set to true,
 the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
 Further details can be found in the "Rejected Alternatives" section.

 From my perspective, RangeQuery is a class responsible for creating a
 series of RangeQuery objects. It offers methods such as withRange,
 withUpperBound, and withLowerBound, allowing us to generate objects
 representing different queries. I'm unsure how adding a
 withDescendingOrder() method would be compatible with the other methods,
 especially considering that, based on KIP 969, WithDescendingKeys() doesn't
 appear to take any input variables. And if withDescendingOrder() doesn't
 accept any input, how does it return a RangeQuery?

 On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
 wrote:

> Hi, Colt,
> The underlying structure of InMemoryKeyValueStore is a TreeMap.
> Sincerely,
> Hanyu
>
> On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng <
> pzh...@confluent.io> wrote:
>
>> Hi Bill,
>> 1. I will update the KIP in accordance with the PR and synchronize
>> their future updates.
>> 2. I will use that name.
>> 3. you mean add something about ordering at the motivation section?
>>
>> Sincerely,
>> Hanyu
>>
>>
>> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng <
>> pzh...@confluent.io> wrote:
>>
>>> Hi, Walker,
>>>
>>> 1. I will update the KIP in accordance with the PR and synchronize
>>> their future updates.
>>> 2. I will use that name.
>>> 3. I'll provide additional details in that section.
>>> 4. I intend to utilize rangeQuery to achieve what we're referring to
>>> as reverseQuery. In essence, reverseQuery is merely a term. To clear up 
>>> any
>>> ambiguity, I'll make necessary adjustments to the KIP.
>>>
>>> Sincerely,
>>> Hanyu
>>>
>>>
>>>
>>> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng <
>>> pzh...@confluent.io> wrote:
>>>
 Ok, I will change it back to following the code, and update them
 together.

 On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
  wrote:

> Hello Hanyu,
>
> Looking over your kip things mostly make sense but I have a couple
> of
> comments.
>
>
>1. You have "withDescandingOrder()". I think you mean
> "descending" :)
>Also there are still a few places in the doc where it's called
> "setReverse"
>2. Also I like "WithDescendingKeys()" better
>3. I'm not sure of what ordering guarantees we are offering.
> Perhaps we
>can add a section to the motivation clearly spelling out the
> current
>ordering and the new offering?
>4. When you say "use unbounded 

[jira] [Resolved] (KAFKA-15483) Update metrics documentation for the new metrics implemented as part of KIP-938

2023-10-04 Thread Divij Vaidya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya resolved KAFKA-15483.
--
Resolution: Fixed

> Update metrics documentation for the new metrics implemented as part of 
> KIP-938
> ---
>
> Key: KAFKA-15483
> URL: https://issues.apache.org/jira/browse/KAFKA-15483
> Project: Kafka
>  Issue Type: Task
>  Components: docs, documentation
>Affects Versions: 3.6.0
>Reporter: Satish Duggana
>Assignee: David Arthur
>Priority: Major
> Fix For: 3.6.0
>
>
> Update the kafka-site documentation for 3.6 release with the newly introduced 
> metrics in 3.6 for KIP-938. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15546) Transactions tool duration field confusing for completed transactions

2023-10-04 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-15546:
--

 Summary: Transactions tool duration field confusing for completed 
transactions
 Key: KAFKA-15546
 URL: https://issues.apache.org/jira/browse/KAFKA-15546
 Project: Kafka
  Issue Type: Task
Reporter: Justine Olshan
Assignee: Justine Olshan


When using the transactions tool to describe transactions, if the transaction 
is completed, its duration will still increase based on when it started. This 
value is not correct. Instead, we can leave the duration field blank (since we 
don't have the data for the completed transaction in the describe response).

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Justine Olshan
Sorry -- not MV but software version.

On Wed, Oct 4, 2023 at 9:51 AM Justine Olshan  wrote:

> Catching up with this discussion.
>
> I was just curious -- have we had other instances where downgrading MV is
> not supported? I think Kafka typically tries to support downgrades, and I
> couldn't think of other examples.
>
> Thanks,
> Justine
>
> On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu 
> wrote:
>
>> Hi Jun,
>> 54. Marked that software downgrading is not supported, as the old
>> controller
>> will not understand the new PartitionRecord and PartitionChangeRecord.
>> Thanks!
>>
>> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao  wrote:
>>
>> > Hi, Calvin,
>> >
>> > Thanks for the reply. Just one more comment.
>> >
>> > 54. It seems that downgrading MV is supported. Is downgrading the
>> software
>> > version supported? It would be useful to document that.
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
>> >  wrote:
>> >
>> > > Hi Colin,
>> > >
>> > > I think in your example "do_unclean_recovery" would need to do
>> different
>> > > things depending on the strategy.
>> > >
>> > > do_unclean_recovery() {
>> > >if (unclean.recovery.manager.enabled) {
>> > > if (strategy == Aggressive)
>> > >   use UncleanRecoveryManager(waitLastKnownERL=false)  // just
>> inspect
>> > > logs from whoever is available
>> > > else
>> > >   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait
>> > for
>> > > at least last known ELR
>> > >   } else {
>> > > if (strategy == Aggressive)
>> > >   choose the last known leader if that is available, or a random
>> > leader
>> > > if not)
>> > > else
>> > >   wait for last known leader to get back
>> > >   }
>> > > }
>> > >
>> > > The idea is that the Aggressive strategy would kick in as soon as we
>> lost
>> > > the leader and would pick a leader from whoever is available; but the
>> > > Balanced will only kick in when ELR is empty and will wait for the
>> > brokers
>> > > that likely have most data to be available.
>> > >
>> > > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe 
>> wrote:
>> > >
>> > > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
>> > > > > Hi, Calvin,
>> > > > >
>> > > > > Thanks for the updated KIP. A few more comments.
>> > > > >
>> > > > > 41. Why would a user choose the option to select a random replica
>> as
>> > > the
>> > > > > leader instead of using unclean.recovery.strategy=Aggressive? It
>> seems
>> > > > that
>> > > > > the latter is strictly better? If that's not the case, could we
>> fold
>> > > this
>> > > > > option under unclean.recovery.strategy instead of introducing a
>> > > separate
>> > > > > config?
>> > > >
>> > > > Hi Jun,
>> > > >
>> > > > I thought the flow of control was:
>> > > >
>> > > > If there is no leader for the partition {
>> > > >   If (there are unfenced ELR members) {
>> > > > choose_an_unfenced_ELR_member
>> > > >   } else if (there are fenced ELR members AND strategy=Aggressive) {
>> > > > do_unclean_recovery
>> > > >   } else if (there are no ELR members AND strategy != None) {
>> > > > do_unclean_recovery
>> > > >   } else {
>> > > > do nothing about the missing leader
>> > > >   }
>> > > > }
>> > > >
>> > > > do_unclean_recovery() {
>> > > >if (unclean.recovery.manager.enabled) {
>> > > > use UncleanRecoveryManager
>> > > >   } else {
>> > > > choose the last known leader if that is available, or a random
>> > leader
>> > > > if not)
>> > > >   }
>> > > > }
>> > > >
>> > > > However, I think this could be clarified, especially the behavior
>> when
>> > > > unclean.recovery.manager.enabled=false. Intuitively the goal for
>> > > > unclean.recovery.manager.enabled=false is to be "the same as now,
>> > mostly"
>> > > > but it's very underspecified in the KIP, I agree.
>> > > >
>> > > > >
>> > > > > 50. ElectLeadersRequest: "If more than 20 topics are included,
>> only
>> > the
>> > > > > first 20 will be served. Others will be returned with
>> > DesiredLeaders."
>> > > > Hmm,
>> > > > > not sure that I understand this. ElectLeadersResponse doesn't
>> have a
>> > > > > DesiredLeaders field.
>> > > > >
>> > > > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included,
>> > only
>> > > > the
>> > > > > first 2000 will be served" Do we return an error for the remaining
>> > > > > partitions? Actually, should we include an errorCode field at the
>> > > > partition
>> > > > > level in GetReplicaLogInfoResponse to cover non-existing
>> partitions
>> > and
>> > > > no
>> > > > > authorization, etc?
>> > > > >
>> > > > > 52. The entry should matches => The entry should match
>> > > > >
>> > > > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable
>> since a
>> > > > user
>> > > > > may not specify DesiredLeaders?
>> > > > >
>> > > > > 54. Downgrade: Is that indeed possible? I thought earlier you said
>> > that
>> > > > > once the new version of the records are in the metadata log, one
>> > can't
>> > > > > downgrade 

[jira] [Created] (KAFKA-15547) Thread leak in MirrorMakerConfigTest#testClientConfigProperties

2023-10-04 Thread Kalpesh Patel (Jira)
Kalpesh Patel created KAFKA-15547:
-

 Summary: Thread leak in 
MirrorMakerConfigTest#testClientConfigProperties
 Key: KAFKA-15547
 URL: https://issues.apache.org/jira/browse/KAFKA-15547
 Project: Kafka
  Issue Type: Bug
Reporter: Kalpesh Patel


The test MirrorMakerConfigTest#testClientConfigProperties opens a 
ForwardingAdmin but fails to close it.

We should enclose this in a try-with-resources statement to ensure the Admin
client is closed and there is no thread leak.
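
A minimal sketch of the suggested fix; `config` stands in for the test's
actual client properties map:

    try (ForwardingAdmin admin = new ForwardingAdmin(config)) {
        // assertions against `admin` go here; close() runs even if they
        // throw, shutting down the underlying Admin client's threads
    }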



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Justine Olshan
Catching up with this discussion.

I was just curious -- have we had other instances where downgrading MV is
not supported? I think Kafka typically tries to support downgrades, and I
couldn't think of other examples.

Thanks,
Justine

On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu 
wrote:

> Hi Jun,
> 54. Marked that software downgrading is not supported, as the old controller
> will not understand the new PartitionRecord and PartitionChangeRecord.
> Thanks!
>
> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao  wrote:
>
> > Hi, Calvin,
> >
> > Thanks for the reply. Just one more comment.
> >
> > 54. It seems that downgrading MV is supported. Is downgrading the
> software
> > version supported? It would be useful to document that.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
> >  wrote:
> >
> > > Hi Colin,
> > >
> > > I think in your example "do_unclean_recovery" would need to do
> different
> > > things depending on the strategy.
> > >
> > > do_unclean_recovery() {
> > >if (unclean.recovery.manager.enabled) {
> > > if (strategy == Aggressive)
> > >   use UncleanRecoveryManager(waitLastKnownERL=false)  // just
> inspect
> > > logs from whoever is available
> > > else
> > >   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait
> > for
> > > at least last known ELR
> > >   } else {
> > > if (strategy == Aggressive)
> > >   choose the last known leader if that is available, or a random
> > leader
> > > if not)
> > > else
> > >   wait for last known leader to get back
> > >   }
> > > }
> > >
> > > The idea is that the Aggressive strategy would kick in as soon as we
> lost
> > > the leader and would pick a leader from whoever is available; but the
> > > Balanced will only kick in when ELR is empty and will wait for the
> > brokers
> > > that likely have most data to be available.
> > >
> > > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe 
> wrote:
> > >
> > > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > > > Hi, Calvin,
> > > > >
> > > > > Thanks for the updated KIP. A few more comments.
> > > > >
> > > > > 41. Why would a user choose the option to select a random replica
> as
> > > the
> > > > > leader instead of using unclean.recovery.strategy=Aggressive? It
> seems
> > > > that
> > > > > the latter is strictly better? If that's not the case, could we
> fold
> > > this
> > > > > option under unclean.recovery.strategy instead of introducing a
> > > separate
> > > > > config?
> > > >
> > > > Hi Jun,
> > > >
> > > > I thought the flow of control was:
> > > >
> > > > If there is no leader for the partition {
> > > >   If (there are unfenced ELR members) {
> > > > choose_an_unfenced_ELR_member
> > > >   } else if (there are fenced ELR members AND strategy=Aggressive) {
> > > > do_unclean_recovery
> > > >   } else if (there are no ELR members AND strategy != None) {
> > > > do_unclean_recovery
> > > >   } else {
> > > > do nothing about the missing leader
> > > >   }
> > > > }
> > > >
> > > > do_unclean_recovery() {
> > > >if (unclean.recovery.manager.enabled) {
> > > > use UncleanRecoveryManager
> > > >   } else {
> > > > choose the last known leader if that is available, or a random
> > leader
> > > > if not)
> > > >   }
> > > > }
> > > >
> > > > However, I think this could be clarified, especially the behavior
> when
> > > > unclean.recovery.manager.enabled=false. Intuitively the goal for
> > > > unclean.recovery.manager.enabled=false is to be "the same as now,
> > mostly"
> > > > but it's very underspecified in the KIP, I agree.
> > > >
> > > > >
> > > > > 50. ElectLeadersRequest: "If more than 20 topics are included, only
> > the
> > > > > first 20 will be served. Others will be returned with
> > DesiredLeaders."
> > > > Hmm,
> > > > > not sure that I understand this. ElectLeadersResponse doesn't have
> a
> > > > > DesiredLeaders field.
> > > > >
> > > > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included,
> > only
> > > > the
> > > > > first 2000 will be served" Do we return an error for the remaining
> > > > > partitions? Actually, should we include an errorCode field at the
> > > > partition
> > > > > level in GetReplicaLogInfoResponse to cover non-existing partitions
> > and
> > > > no
> > > > > authorization, etc?
> > > > >
> > > > > 52. The entry should matches => The entry should match
> > > > >
> > > > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable
> since a
> > > > user
> > > > > may not specify DesiredLeaders?
> > > > >
> > > > > 54. Downgrade: Is that indeed possible? I thought earlier you said
> > that
> > > > > once the new version of the records are in the metadata log, one
> > can't
> > > > > downgrade since the old broker doesn't know how to parse the new
> > > version
> > > > of
> > > > > the metadata records?
> > > > >
> > > >
> > > > MetadataVersion downgrade is currently broken, but fixing it is on our
> > > > plate for Kafka 3.7.
> > > >
> > > > 

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread David Jacot
Hi Calvin,

I thought that a new snapshot with the downgraded MV is created in this
case. Isn't that the case?

Could you also elaborate a bit more on the reasoning behind adding the
limits to the admin RPCs? This is a new pattern in Kafka, so it would be
good to be clear on the motivation.

Could you also explain how the client is supposed to handle the
topics/partitions above the limit? I suppose that it will have to retry
those, correct?

My understanding is that the topics/partitions above the limit will be
failed with an invalid request error. I wonder if this choice is
judicious, because the invalid request exception is usually fatal. It may
be better to use a new and explicit error for this case.

It seems that we still need to specify the changes to the admin API to
accommodate the new or updated APIs. Do you plan to add them?

Best,
David

On Wed, Oct 4, 2023 at 20:39, Calvin Liu  wrote:

> Hi Jun,
> After the MV downgrade, the controller will write in the old version of the
> PartitionRecord/PartitionChangeRecord. If I understand correctly, it is
> possible to downgrade the software version if the controller only has to
> handle old version records.
> However, the controller will not automatically rewrite the PartitionRecord
> with the old version unless there is a partition update. The user may then
> have to wait an unknown amount of time before they can downgrade the software,
> unless they do a roll to force-update every partition. If it makes sense, I
> can mention these steps for doing a software downgrade.
> Thanks
>
> On Wed, Oct 4, 2023 at 11:20 AM Jun Rao  wrote:
>
> > Hi, Calvin and Justine,
> >
> > Historically, when we change the record format in the log, we don't
> support
> > software version downgrading.
> >
> > For the record format change in the metadata log, have we thought about
> > forcing the write of the latest metadata records with the old version
> > during MV downgrading? This will in theory allow the old version of the
> > software to obtain the latest metadata.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Oct 4, 2023 at 9:53 AM Justine Olshan
>  > >
> > wrote:
> >
> > > Sorry -- not MV but software version.
> > >
> > > On Wed, Oct 4, 2023 at 9:51 AM Justine Olshan 
> > > wrote:
> > >
> > > > Catching up with this discussion.
> > > >
> > > > I was just curious -- have we had other instances where downgrading
> MV
> > is
> > > > not supported? I think Kafka typically tries to support downgrades,
> > and I
> > > > couldn't think of other examples.
> > > >
> > > > Thanks,
> > > > Justine
> > > >
> > > > On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu  >
> > > > wrote:
> > > >
> > > >> Hi Jun,
> > > >> 54. Marked that software downgrading is not supported, as the old
> > > >> controller
> > > >> will not understand the new PartitionRecord and
> PartitionChangeRecord.
> > > >> Thanks!
> > > >>
> > > >> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao 
> > > wrote:
> > > >>
> > > >> > Hi, Calvin,
> > > >> >
> > > >> > Thanks for the reply. Just one more comment.
> > > >> >
> > > >> > 54. It seems that downgrading MV is supported. Is downgrading the
> > > >> software
> > > >> > version supported? It would be useful to document that.
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Jun
> > > >> >
> > > >> > On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
> > > >> >  wrote:
> > > >> >
> > > >> > > Hi Colin,
> > > >> > >
> > > >> > > I think in your example "do_unclean_recovery" would need to do
> > > >> different
> > > >> > > things depending on the strategy.
> > > >> > >
> > > >> > > do_unclean_recovery() {
> > > >> > >if (unclean.recovery.manager.enabled) {
> > > >> > > if (strategy == Aggressive)
> > > >> > >   use UncleanRecoveryManager(waitLastKnownERL=false)  //
> just
> > > >> inspect
> > > >> > > logs from whoever is available
> > > >> > > else
> > > >> > >   use  UncleanRecoveryManager(waitLastKnownERL=true)  //
> must
> > > wait
> > > >> > for
> > > >> > > at least last known ELR
> > > >> > >   } else {
> > > >> > > if (strategy == Aggressive)
> > > >> > >   choose the last known leader if that is available, or a
> > random
> > > >> > leader
> > > >> > > if not)
> > > >> > > else
> > > >> > >   wait for last known leader to get back
> > > >> > >   }
> > > >> > > }
> > > >> > >
> > > >> > > The idea is that the Aggressive strategy would kick in as soon
> as
> > we
> > > >> lost
> > > >> > > the leader and would pick a leader from whoever is available;
> but
> > > the
> > > >> > > Balanced will only kick in when ELR is empty and will wait for
> the
> > > >> > brokers
> > > >> > > that likely have most data to be available.
> > > >> > >
> > > >> > > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe  >
> > > >> wrote:
> > > >> > >
> > > >> > > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > >> > > > > Hi, Calvin,
> > > >> > > > >
> > > >> > > > > Thanks for the updated KIP. A few more comments.
> > > >> > > > >
> > > >> > > > > 41. Why would a user choose the option 

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Jun Rao
Hi, Calvin and Justine,

Historically, when we change the record format in the log, we don't support
software version downgrading.

For the record format change in the metadata log, have we thought about
forcing the write of the latest metadata records with the old version
during MV downgrading? This will in theory allow the old version of the
software to obtain the latest metadata.

Thanks,

Jun

On Wed, Oct 4, 2023 at 9:53 AM Justine Olshan 
wrote:

> Sorry -- not MV but software version.
>
> On Wed, Oct 4, 2023 at 9:51 AM Justine Olshan 
> wrote:
>
> > Catching up with this discussion.
> >
> > I was just curious -- have we had other instances where downgrading MV is
> > not supported? I think Kafka typically tries to support downgrades, and I
> > couldn't think of other examples.
> >
> > Thanks,
> > Justine
> >
> > On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu 
> > wrote:
> >
> >> Hi Jun,
> >> 54. Marked that software downgrading is not supported, as the old
> >> controller
> >> will not understand the new PartitionRecord and PartitionChangeRecord.
> >> Thanks!
> >>
> >> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao 
> wrote:
> >>
> >> > Hi, Calvin,
> >> >
> >> > Thanks for the reply. Just one more comment.
> >> >
> >> > 54. It seems that downgrading MV is supported. Is downgrading the
> >> software
> >> > version supported? It would be useful to document that.
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> > On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
> >> >  wrote:
> >> >
> >> > > Hi Colin,
> >> > >
> >> > > I think in your example "do_unclean_recovery" would need to do
> >> different
> >> > > things depending on the strategy.
> >> > >
> >> > > do_unclean_recovery() {
> >> > >if (unclean.recovery.manager.enabled) {
> >> > > if (strategy == Aggressive)
> >> > >   use UncleanRecoveryManager(waitLastKnownERL=false)  // just
> >> inspect
> >> > > logs from whoever is available
> >> > > else
> >> > >   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must
> wait
> >> > for
> >> > > at least last known ELR
> >> > >   } else {
> >> > > if (strategy == Aggressive)
> >> > >   choose the last known leader if that is available, or a random
> >> > leader
> >> > > if not)
> >> > > else
> >> > >   wait for last known leader to get back
> >> > >   }
> >> > > }
> >> > >
> >> > > The idea is that the Aggressive strategy would kick in as soon as we
> >> lost
> >> > > the leader and would pick a leader from whoever is available; but
> the
> >> > > Balanced will only kick in when ELR is empty and will wait for the
> >> > brokers
> >> > > that likely have most data to be available.
> >> > >
> >> > > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe 
> >> wrote:
> >> > >
> >> > > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> >> > > > > Hi, Calvin,
> >> > > > >
> >> > > > > Thanks for the updated KIP. A few more comments.
> >> > > > >
> >> > > > > 41. Why would a user choose the option to select a random
> replica
> >> as
> >> > > the
> >> > > > > leader instead of using unclean.recovery.strategy=Aggressive? It
> >> seems
> >> > > > that
> >> > > > > the latter is strictly better? If that's not the case, could we
> >> fold
> >> > > this
> >> > > > > option under unclean.recovery.strategy instead of introducing a
> >> > > separate
> >> > > > > config?
> >> > > >
> >> > > > Hi Jun,
> >> > > >
> >> > > > I thought the flow of control was:
> >> > > >
> >> > > > If there is no leader for the partition {
> >> > > >   If (there are unfenced ELR members) {
> >> > > > choose_an_unfenced_ELR_member
> >> > > >   } else if (there are fenced ELR members AND
> strategy=Aggressive) {
> >> > > > do_unclean_recovery
> >> > > >   } else if (there are no ELR members AND strategy != None) {
> >> > > > do_unclean_recovery
> >> > > >   } else {
> >> > > > do nothing about the missing leader
> >> > > >   }
> >> > > > }
> >> > > >
> >> > > > do_unclean_recovery() {
> >> > > >if (unclean.recovery.manager.enabled) {
> >> > > > use UncleanRecoveryManager
> >> > > >   } else {
> >> > > > choose the last known leader if that is available, or a random
> >> > leader
> >> > > > if not)
> >> > > >   }
> >> > > > }
> >> > > >
> >> > > > However, I think this could be clarified, especially the behavior
> >> when
> >> > > > unclean.recovery.manager.enabled=false. Inuitively the goal for
> >> > > > unclean.recovery.manager.enabled=false is to be "the same as now,
> >> > mostly"
> >> > > > but it's very underspecified in the KIP, I agree.
> >> > > >
> >> > > > >
> >> > > > > 50. ElectLeadersRequest: "If more than 20 topics are included,
> >> only
> >> > the
> >> > > > > first 20 will be served. Others will be returned with
> >> > DesiredLeaders."
> >> > > > Hmm,
> >> > > > > not sure that I understand this. ElectLeadersResponse doesn't
> >> have a
> >> > > > > DesiredLeaders field.
> >> > > > >
> >> > > > > 51. GetReplicaLogInfo: "If more than 2000 partitions are
> included,
> >> > only
> >> > > > 

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Calvin Liu
Hi Jun,
After the MV downgrade, the controller will write in the old version of the
PartitionRecord/PartitionChangeRecord. If I understand correctly, it is
possible to downgrade the software version if the controller only has to
handle old version records.
However, the controller will not automatically rewrite the PartitionRecord
with the old version unless there is a partition update. Then, the user may
have to wait an unknown amount of time before the software can be downgraded,
unless they do a roll to force an update of every partition. If it makes sense, I
can mention these steps for doing a software downgrade.
Thanks

On Wed, Oct 4, 2023 at 11:20 AM Jun Rao  wrote:

> Hi, Calvin and Justine,
>
> Historically, when we change the record format in the log, we don't support
> software version downgrading.
>
> For the record format change in the metadata log, have we thought about
> forcing the write of the latest metadata records with the old version
> during MV downgrading? This will in theory allow the old version of the
> software to obtain the latest metadata.
>
> Thanks,
>
> Jun
>
> On Wed, Oct 4, 2023 at 9:53 AM Justine Olshan  >
> wrote:
>
> > Sorry -- not MV but software version.
> >
> > On Wed, Oct 4, 2023 at 9:51 AM Justine Olshan 
> > wrote:
> >
> > > Catching up with this discussion.
> > >
> > > I was just curious -- have we had other instances where downgrading MV
> is
> > > not supported? I think Kafka typically tries to support downgrades,
> and I
> > > couldn't think of other examples.
> > >
> > > Thanks,
> > > Justine
> > >
> > > On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu 
> > > wrote:
> > >
> > >> Hi Jun,
> > >> 54. Marked the software downgrading is not supported. As the old
> > >> controller
> > >> will not understand the new PartitionRecord and PartitionChangeRecord.
> > >> Thanks!
> > >>
> > >> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao 
> > wrote:
> > >>
> > >> > Hi, Calvin,
> > >> >
> > >> > Thanks for the reply. Just one more comment.
> > >> >
> > >> > 54. It seems that downgrading MV is supported. Is downgrading the
> > >> software
> > >> > version supported? It would be useful to document that.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jun
> > >> >
> > >> > On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
> > >> >  wrote:
> > >> >
> > >> > > Hi Colin,
> > >> > >
> > >> > > I think in your example "do_unclean_recovery" would need to do
> > >> different
> > >> > > things depending on the strategy.
> > >> > >
> > >> > > do_unclean_recovery() {
> > >> > >if (unclean.recovery.manager.enabled) {
> > >> > > if (strategy == Aggressive)
> > >> > >   use UncleanRecoveryManager(waitLastKnownERL=false)  // just
> > >> inspect
> > >> > > logs from whoever is available
> > >> > > else
> > >> > >   use  UncleanRecoveryManager(waitLastKnownERL=true)  // must
> > wait
> > >> > for
> > >> > > at least last known ELR
> > >> > >   } else {
> > >> > > if (strategy == Aggressive)
> > >> > >   choose the last known leader if that is available, or a
> random
> > >> > leader
> > >> > > if not)
> > >> > > else
> > >> > >   wait for last known leader to get back
> > >> > >   }
> > >> > > }
> > >> > >
> > >> > > The idea is that the Aggressive strategy would kick in as soon as
> we
> > >> lost
> > >> > > the leader and would pick a leader from whoever is available; but
> > the
> > >> > > Balanced will only kick in when ELR is empty and will wait for the
> > >> > brokers
> > >> > > that likely have most data to be available.
> > >> > >
> > >> > > On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe 
> > >> wrote:
> > >> > >
> > >> > > > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > >> > > > > Hi, Calvin,
> > >> > > > >
> > >> > > > > Thanks for the update KIP. A few more comments.
> > >> > > > >
> > >> > > > > 41. Why would a user choose the option to select a random
> > replica
> > >> as
> > >> > > the
> > >> > > > > leader instead of using unclean.recovery.strateg=Aggressive?
> It
> > >> seems
> > >> > > > that
> > >> > > > > the latter is strictly better? If that's not the case, could
> we
> > >> fold
> > >> > > this
> > >> > > > > option under unclean.recovery.strategy instead of introducing
> a
> > >> > > separate
> > >> > > > > config?
> > >> > > >
> > >> > > > Hi Jun,
> > >> > > >
> > >> > > > I thought the flow of control was:
> > >> > > >
> > >> > > > If there is no leader for the partition {
> > >> > > >   If (there are unfenced ELR members) {
> > >> > > > choose_an_unfenced_ELR_member
> > >> > > >   } else if (there are fenced ELR members AND
> > strategy=Aggressive) {
> > >> > > > do_unclean_recovery
> > >> > > >   } else if (there are no ELR members AND strategy != None) {
> > >> > > > do_unclean_recovery
> > >> > > >   } else {
> > >> > > > do nothing about the missing leader
> > >> > > >   }
> > >> > > > }
> > >> > > >
> > >> > > > do_unclean_recovery() {
> > >> > > >if (unclean.recovery.manager.enabled) {
> > >> > > > use UncleanRecoveryManager
> > >> > > >   

Re: [DISCUSS] KIP-980: Allow creating connectors in a stopped state

2023-10-04 Thread Chris Egerton
Hi Yash,

Looking great! Few more thoughts:


1. (Downgraded to nit) I still prefer dot-delimitation but it's not a
blocker; thanks for addressing my concerns about the name of the field and
how it may be perceived by users.

2. (Addressed) Thanks for looking into this, and sorry it turned out to be
a bit of a dead end! I'm convinced that the current proposal is good enough.

3. Can you shed a little more light on how we'll determine whether a
connector config should be parsed as JSON or as a properties file? Will
this be based on file extension, a command-line flag (which might apply to
all configs, or individual configs), attempting to parse first as one
format then the other, something else? (A small sketch of the sniffing option follows after this list.)

4. (Addressed) Thanks! Looks great.

6. (Addressed) Awesome, great to hear. The point about laggy connector
startup is very convincing; my paranoia is satiated.
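
Regarding point 3, to make the "attempt to parse" option concrete, here's a
minimal sketch; the helper name and approach are assumptions on my end, not
anything the KIP specifies:

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical format sniffing: treat the file as JSON if its first
// non-whitespace character is '{', otherwise fall back to properties.
public class ConfigFormatSniffer {
    public static boolean looksLikeJson(Path configFile) throws IOException {
        try (Reader reader = Files.newBufferedReader(configFile)) {
            int c = reader.read();
            while (c != -1 && Character.isWhitespace(c)) {
                c = reader.read();
            }
            return c == '{';
        }
    }
}
```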


Cheers,

Chris

On Wed, Oct 4, 2023 at 5:35 AM Yash Mayya  wrote:

> Hi Chris,
>
> Thanks for the quick follow up and the continued insightful discourse!
>
> 1. Fair point on the need to differentiate it from the actual state
> displayed in the status API, I like the prefix of "initial" to make that
> differentiation (from your suggested alternatives previously). Regarding
> the dots vs underscores as delimiters - the new state field will be a top
> level field in the connector creation request body alongside the "config"
> map (i.e. it won't be a connector configuration itself), so I think we
> should be using the underscore delimiter for consistency. For now, I've
> updated the KIP to use "initial_state" as the new field's name - let me
> know if you disagree, and I'd be happy to reconsider.
>
> 2. Hm, I actually hadn't considered the downgrade implications with your
> proposed single record approach. I agree that it's a bigger downside than
> writing two records to the config topic. I do understand your concerns with
> the potential for config topic inconsistencies which is why I proposed
> writing the target state first (since the presence of a target state for a
> connector with no configuration is a benign condition). Also, even in the
> non-transactional config topic producer case - if there is a failure
> between the two writes, the user will be notified of the error
> synchronously via the API response (ref -
> https://github.com/apache/kafka/pull/12984) and will be able to safely
> retry the operation. I don't see how we'd be able to do a single record
> write approach along with supporting clean downgrades since we'd either
> need to introduce a new record type or add a new field to an existing
> record type - neither of which would be recognized as such by an older
> Connect worker.
>
> > Standalone mode has always supported the REST API,
> > and so far FWICT we've maintained feature parity between
> > the two modes
>
> > add support for JSON files with standalone mode.
>
> 3. Thanks, I wasn't aware about standalone mode always having supported the
> full REST API - I thought I'd seen some references earlier indicating
> otherwise. In that case, I do agree that it makes sense to maintain parity
> across both methods of connector creation for user experience consistency.
> I really like the idea of updating the standalone mode CLI to be able to
> parse JSON files (in the same format as the connector creation REST API
> endpoint request body) along with Java properties files since I think that
> offers two big benefits. One is that users will be able to copy and use
> examples across both the methods of connector creation (REST API requests
> with JSON request bodies and JSON files passed to the standalone mode
> startup CLI). The second benefit is that any future extensions (such as the
> "offsets" field we've discussed in this thread) would be easily applied
> across both the methods consistently instead of introducing new (and likely
> ugly) CLI flags. I've updated the KIP to include this change in the
> standalone mode CLI.
>
> 4. Makes sense, I've added this under a new "Future Work" section in the
> KIP.
>
> 6. From what I can tell, there shouldn't be any issues with the lack of
> task configurations in the config topic and it seems to be a supported
> assumption across the Connect code-base that a connector configuration
> could exist without any task configurations for the connector (a situation
> that could currently manifest with slow starting connectors, connectors
> that fail during startup, connectors that fail to generate task
> configurations, connectors that are paused right after being created etc.).
> I did also try out a small prototype before publishing this KIP and things
> do work as expected when creating a connector in the PAUSED / STOPPED state
> by simply writing the appropriate target state along with the connector
> configuration to the config topic.
>
> Thanks,
> Yash
>
> On Wed, Oct 4, 2023 at 1:44 AM Chris Egerton 
> wrote:
>
> > Hi Yash,
> >
> > Thanks for the in-depth discussion! Continuations here:
> >
> > 1. 

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-04 Thread Calvin Liu
Hi David,
Thanks for the comments.

I thought that a new snapshot with the downgraded MV is created in this
case. Isn’t it the case?
Yes, you are right, a metadata delta will be generated after the MV
downgrade. Then the user can start the software downgrade.
-
Could you also elaborate a bit more on the reasoning behind adding the
limits to the admin RPCs? This is a new pattern in Kafka so it would be
good to clear on the motivation.
Thanks to Colin for bringing it up. The current MetadataRequest does not
have a limit on the number of topics to query in a single request. Massive
requests can mess up the JVM. We want to have some sort of throttle on the
new APIs.
-
Could you also explain how the client is supposed to handle the
topics/partitions above the limit? I suppose that it will have to retry
those, correct?
Correct. For the official admin clients, it will split the large request
into properly sized pieces and query them one after another.
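To make the batching concrete, a minimal sketch; the 20-topic limit matches
this discussion, but the class and method names are assumptions, not the
actual admin client code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of client-side batching: split a large topic list into chunks
// that respect the per-request limit, to be sent one after another.
public class DescribeTopicBatching {
    static final int MAX_TOPICS_PER_REQUEST = 20; // limit discussed above

    static List<List<String>> splitIntoBatches(List<String> topics) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < topics.size(); i += MAX_TOPICS_PER_REQUEST) {
            batches.add(new ArrayList<>(topics.subList(
                    i, Math.min(i + MAX_TOPICS_PER_REQUEST, topics.size()))));
        }
        return batches;
    }
}
```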
-
My understanding is that the topics/partitions above the limit will be
failed with an invalid exception error. I wonder if this choice is
judicious because the invalid request exception is usually fatal. It may
be better to use a new and explicit error for this case.

Thanks for bringing this up. How about "REQUEST_LIMIT_REACHED"?

It seems that we still need to specify the changes to the admin api to
accommodate the new or updated apis. Do you plan to add them?
I'll try to cover the following:
1. The admin client will use the new DescribeTopicRequest to query the
topics.
2. Mention the API limit and the new retriable error.
3. Output changes for the admin client when describing a topic (new fields
of ELR...).
4. Changes to data structures like TopicPartitionInfo to include the ELR
(a rough sketch follows below).
Anything else I missed?
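
For item 4, a rough sketch of what the extended data structure could look
like; the new field and accessor names are assumptions based on this
discussion, not the final API:

```java
import java.util.List;
import org.apache.kafka.common.Node;

// Sketch: TopicPartitionInfo-like structure extended with ELR fields.
public class TopicPartitionInfoSketch {
    private final int partition;
    private final Node leader;
    private final List<Node> replicas;
    private final List<Node> isr;
    private final List<Node> elr;          // eligible leader replicas (new)
    private final List<Node> lastKnownElr; // last known ELR (new)

    public TopicPartitionInfoSketch(int partition, Node leader,
                                    List<Node> replicas, List<Node> isr,
                                    List<Node> elr, List<Node> lastKnownElr) {
        this.partition = partition;
        this.leader = leader;
        this.replicas = replicas;
        this.isr = isr;
        this.elr = elr;
        this.lastKnownElr = lastKnownElr;
    }

    public List<Node> elr() { return elr; }
    public List<Node> lastKnownElr() { return lastKnownElr; }
}
```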

Thanks!





On Wed, Oct 4, 2023 at 12:27 PM David Jacot  wrote:

> Hi Calvin,
>
> I thought that a new snapshot with the downgraded MV is created in this
> case. Isn’t it the case?
>
> Could you also elaborate a bit more on the reasoning behind adding the
> limits to the admin RPCs? This is a new pattern in Kafka so it would be
> good to clear on the motivation.
>
> Could you also explain how the client is supposed to handle the
> topics/partitions above the limit? I suppose that it will have to retry
> those, correct?
>
> My understanding is that the topics/partitions above the limit will be
> failed with an invalid exception error. I wonder if this choice is
> judicious because the invalid request exception is usually fatal. It may
> be better to use a new and explicit error for this case.
>
> It seems that we still need to specify the changes to the admin api to
> accommodate the new or updated apis. Do you plan to add them?
>
> Best,
> David
>
> Le mer. 4 oct. 2023 à 20:39, Calvin Liu  a
> écrit :
>
> > Hi Jun,
> > After the MV downgrade, the controller will write in the old version of
> the
> > PartitionRecord/PartitionChangeRecord. If I understand correctly, it is
> > possible to downgrade the software version if the controller only has to
> > handle old version records.
> > However, the controller will not automatically rewrite the
> PartitionRecord
> > with the old version unless there is a partition update. Then, the user
> may
> > have to wait an unknown amount of time before the software downgrades
> > unless they do a roll to force update every partition. If it makes
> sense, I
> > can mention these steps to do a software downgrade.
> > Thanks
> >
> > On Wed, Oct 4, 2023 at 11:20 AM Jun Rao 
> wrote:
> >
> > > Hi, Calvin and Justine,
> > >
> > > Historically, when we change the record format in the log, we don't
> > support
> > > software version downgrading.
> > >
> > > For the record format change in the metadata log, have we thought about
> > > forcing the write of the latest metadata records with the old version
> > > during MV downgrading? This will in theory allow the old version of the
> > > software to obtain the latest metadata.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Oct 4, 2023 at 9:53 AM Justine Olshan
> >  > > >
> > > wrote:
> > >
> > > > Sorry -- not MV but software version.
> > > >
> > > > On Wed, Oct 4, 2023 at 9:51 AM Justine Olshan 
> > > > wrote:
> > > >
> > > > > Catching up with this discussion.
> > > > >
> > > > > I was just curious -- have we had other instances where downgrading
> > MV
> > > is
> > > > > not supported? I think Kafka typically tries to support downgrades,
> > > and I
> > > > > couldn't think of other examples.
> > > > >
> > > > > Thanks,
> > > > > Justine
> > > > >
> > > > > On Wed, Oct 4, 2023 at 9:40 AM Calvin Liu
>  > >
> > > > > wrote:
> > > > >
> > > > >> Hi Jun,
> > > > >> 54. Marked the software downgrading is not supported. As the old
> > > > >> controller
> > > > >> will not understand the new PartitionRecord and
> > PartitionChangeRecord.
> > > > >> Thanks!
> > > > >>
> > > > >> On Wed, Oct 4, 2023 at 9:12 AM Jun Rao 
> > > > wrote:
> > > > >>
> > > > >> > Hi, Calvin,
> > > > 

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Hao Li
Hi Hanyu,

Thanks for the KIP! Seems there are already a lot of good discussions. I
only have two comments:

1. Please make it clear in
```
/**
 * Interactive range query using a lower and upper bound to filter the keys returned.
 * @param lower The key that specifies the lower bound of the range
 * @param upper The key that specifies the upper bound of the range
 * @param <K> The key type
 * @param <V> The value type
 */
public static <K, V> RangeQuery<K, V> withRange(final K lower, final K upper) {
    return new RangeQuery<>(Optional.ofNullable(lower),
                            Optional.ofNullable(upper), true);
}
```
that a `null` in the lower or upper parameter means it's unbounded.
2. What's the behavior if lower is 3 and upper is 1? Is it an
IllegalArgumentException or will this return an empty result? Maybe also
clarify this in the document.
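
To illustrate point 1, a small usage sketch assuming a null bound means
"unbounded" on that side (the semantics to be confirmed in the javadoc):

```java
import org.apache.kafka.streams.query.RangeQuery;

public class RangeQueryExamples {
    RangeQuery<Integer, String> fullScan  = RangeQuery.withRange(null, null); // all keys
    RangeQuery<Integer, String> upperOnly = RangeQuery.withRange(null, 3);    // keys <= 3
    RangeQuery<Integer, String> lowerOnly = RangeQuery.withRange(1, null);    // keys >= 1
}
```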

Thanks,
Hao


On Wed, Oct 4, 2023 at 9:27 AM Hanyu (Peter) Zheng
 wrote:

> For testing purposes, we previously used a Set to record the results in
> IQv2StoreIntegrationTest. Let's take an example where we now have two
> partitions and four key-value pairs: <0,0> in p0, <1,1> in p1, <2,2> in p0,
> and <3,3> in p1.
>
> If we execute withRange(1,3), it will return a Set of <1, 2, 3>. However,
> if we run withRange(1,3).withDescendingKeys(), and still use a Set, the
> result will again be a Set of <1,2,3>. This means we won't be able to
> determine whether the results have been reversed.
>
> To resolve this ambiguity, I've switched to using a List to record the
> results, ensuring the order of retrieval from partitions p0 and p1. So,
> withRange(1,3) would yield a List of [2, 1, 3], whereas
> withRange(1,3).withDescendingKeys() would produce a List of [2,3,1].
>
> This ordering makes sense since RocksDB sorts its keys, and InMemoryStore
> uses a TreeMap structure, which means the keys are already sorted.
>
> Sincerely,
> Hanyu
>
> On Wed, Oct 4, 2023 at 9:25 AM Hanyu (Peter) Zheng 
> wrote:
>
> > Hi,  Bruno
> >
> > Thank you for your suggestions, I will update them soon.
> > Sincerely,
> >
> > Hanyu
> >
> > On Wed, Oct 4, 2023 at 9:25 AM Hanyu (Peter) Zheng 
> > wrote:
> >
> >> Hi, Lucas,
> >>
> >> Thank you for your suggestions.
> >> I will update the KIP and code together.
> >>
> >> Sincerely,
> >> Hanyu
> >>
> >> On Tue, Oct 3, 2023 at 8:16 PM Hanyu (Peter) Zheng  >
> >> wrote:
> >>
> >>> If we use  WithDescendingKeys() to generate a RangeQuery to do the
> >>> reveseQuery, how do we achieve the methods like withRange,
> withUpperBound,
> >>> and withLowerBound only in this method?
> >>>
> >>> On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng <
> pzh...@confluent.io>
> >>> wrote:
> >>>
>  I believe there's no need to introduce a method like
>  WithDescendingKeys(). Instead, we can simply add a reverse flag to
>  RangeQuery. Each method within RangeQuery would then accept an
> additional
>  parameter. If the reverse is set to true, it would indicate the
> results
>  should be reversed.
> 
>  Initially, I introduced a reverse variable. When set to false, the
>  RangeQuery class behaves normally. However, when reverse is set to
> true,
>  the RangeQuery essentially takes on the functionality of
> ReverseRangeQuery.
>  Further details can be found in the "Rejected Alternatives" section.
> 
>  In my perspective, RangeQuery is a class responsible for creating a
>  series of RangeQuery objects. It offers methods such as withRange,
>  withUpperBound, and withLowerBound, allowing us to generate objects
>  representing different queries. I'm unsure how adding a
>  withDescendingOrder() method would be compatible with the other
> methods,
>  especially considering that, based on KIP 969, WithDescendingKeys()
> doesn't
>  appear to take any input variables. And if withDescendingOrder()
> doesn't
>  accept any input, how does it return a RangeQuery?
> 
>  On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng <
> pzh...@confluent.io>
>  wrote:
> 
> > Hi, Colt,
> > The underlying structure of inMemoryKeyValueStore is treeMap.
> > Sincerely,
> > Hanyu
> >
> > On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng <
> > pzh...@confluent.io> wrote:
> >
> >> Hi Bill,
> >> 1. I will update the KIP in accordance with the PR and synchronize
> >> their future updates.
> >> 2. I will use that name.
> >> 3. you mean add something about ordering at the motivation section?
> >>
> >> Sincerely,
> >> Hanyu
> >>
> >>
> >> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng <
> >> pzh...@confluent.io> wrote:
> >>
> >>> Hi, Walker,
> >>>
> >>> 1. I will update the KIP in accordance with the PR and synchronize
> >>> their future updates.
> >>> 2. I will use that name.
> >>> 3. I'll provide additional details in that section.
> >>> 4. I intend to utilize rangeQuery to achieve what we're referring
> to
> >>> as 

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2257

2023-10-04 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 421971 lines...]
Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> Zero query results shouldn't error STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> Zero query results shouldn't error PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> Valid query results still works STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > StateQueryResultTest 
> Valid query results still works PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [1] 
org.apache.kafka.streams.state.internals.RocksDBStore@371bd24a, 
org.apache.kafka.test.MockInternalProcessorContext@3e54615c STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [1] 
org.apache.kafka.streams.state.internals.RocksDBStore@371bd24a, 
org.apache.kafka.test.MockInternalProcessorContext@3e54615c PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [2] 
org.apache.kafka.streams.state.internals.RocksDBTimestampedStore@6a4238a8, 
org.apache.kafka.test.MockInternalProcessorContext@6748f90 STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [2] 
org.apache.kafka.streams.state.internals.RocksDBTimestampedStore@6a4238a8, 
org.apache.kafka.test.MockInternalProcessorContext@6748f90 PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > 
shouldRecordCorrectBlockCachePinnedUsage(RocksDBStore, StateStoreContext) > [1] 
org.apache.kafka.streams.state.internals.RocksDBStore@2462e91d, 
org.apache.kafka.test.MockInternalProcessorContext@1c9e2517 STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
RocksDBBlockCacheMetricsTest > 
shouldRecordCorrectBlockCachePinnedUsage(RocksDBStore, StateStoreContext) > [1] 

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #85

2023-10-04 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 306903 lines...]
> Task :connect:api:compileTestJava UP-TO-DATE
> Task :connect:api:testClasses UP-TO-DATE
> Task :connect:json:jar UP-TO-DATE
> Task :connect:json:generateMetadataFileForMavenJavaPublication
> Task :connect:api:testJar
> Task :connect:api:testSrcJar
> Task :clients:generateMetadataFileForMavenJavaPublication
> Task :connect:api:publishMavenJavaPublicationToMavenLocal
> Task :connect:api:publishToMavenLocal
> Task :connect:json:publishMavenJavaPublicationToMavenLocal
> Task :connect:json:publishToMavenLocal
> Task :storage:api:compileTestJava
> Task :storage:api:testClasses
> Task :server-common:compileTestJava
> Task :server-common:testClasses
> Task :raft:compileTestJava
> Task :raft:testClasses

> Task :clients:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java:81:
 warning - Tag @link:illegal character: "60" in "#define(String, Type, 
Importance, String, String, int, Width, String, List)"
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java:81:
 warning - Tag @link:illegal character: "62" in "#define(String, Type, 
Importance, String, String, int, Width, String, List)"
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java:81:
 warning - Tag @link: can't find define(String, Type, Importance, String, 
String, int, Width, String, List) in 
org.apache.kafka.common.config.ConfigDef

> Task :group-coordinator:compileTestJava
> Task :group-coordinator:testClasses
> Task :core:compileScala

> Task :clients:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API;>KIP-554:
 Add Broker-side SCRAM Config API

 This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
 The type field in both files must match and must not change. The type field
 is used both for passing ScramCredentialUpsertion and for the internal
 UserScramCredentialRecord. Do not change the type field."
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
5 warnings

> Task :clients:javadocJar
> Task :metadata:compileTestJava
> Task :metadata:testClasses
> Task :clients:srcJar
> Task :clients:testJar
> Task :clients:testSrcJar
> Task :clients:publishMavenJavaPublicationToMavenLocal
> Task :clients:publishToMavenLocal
> Task :core:classes
> Task :core:compileTestJava NO-SOURCE
> Task :core:compileTestScala
> Task :core:testClasses
> Task :streams:compileTestJava
> Task :streams:testClasses
> Task :streams:testJar
> Task :streams:testSrcJar
> Task :streams:publishMavenJavaPublicationToMavenLocal
> Task :streams:publishToMavenLocal

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.2.1/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD SUCCESSFUL in 5m 22s
94 actionable tasks: 41 executed, 53 up-to-date

Publishing build scan...
https://ge.apache.org/s/7ltoz5qspq32u

[Pipeline] sh
+ grep ^version= gradle.properties
+ cut -d= -f 2
[Pipeline] dir
Running in 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/streams/quickstart
[Pipeline] {
[Pipeline] sh
+ mvn clean install -Dgpg.skip
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Kafka Streams :: Quickstart[pom]
[INFO] streams-quickstart-java[maven-archetype]
[INFO] 
[INFO] < org.apache.kafka:streams-quickstart >-
[INFO] Building Kafka Streams :: Quickstart 3.6.1-SNAPSHOT[1/2]
[INFO]   from pom.xml
[INFO] [ pom ]-
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart ---
[INFO] 
[INFO] --- site:3.5.1:attach-descriptor (attach-descriptor) @ 
streams-quickstart ---
[INFO] 
[INFO] --- gpg:1.6:sign (sign-artifacts) @ streams-quickstart ---
[INFO] 
[INFO] --- install:2.5.2:install (default-install) 

[jira] [Resolved] (KAFKA-15547) Thread leak in MirrorMakerConfigTest#testClientConfigProperties

2023-10-04 Thread Yash Mayya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yash Mayya resolved KAFKA-15547.

Fix Version/s: 3.7.0
   Resolution: Fixed

> Thread leak in MirrorMakerConfigTest#testClientConfigProperties
> ---
>
> Key: KAFKA-15547
> URL: https://issues.apache.org/jira/browse/KAFKA-15547
> Project: Kafka
>  Issue Type: Bug
>Reporter: Kalpesh Patel
>Assignee: Kalpesh Patel
>Priority: Minor
> Fix For: 3.7.0
>
>
> The test MirrorMakerConfigTest#testClientConfigProperties opens a 
> ForwardingAdmin but fails to close it.
> We should enclose this in a try-with-resources statement to ensure the Admin 
> client is closed and there is no thread leak



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-04 Thread Matthias J. Sax

Great discussion!

It seems the only open question might be about ordering guarantees? 
IIRC, we had a discussion about this in the past.



Technically (at least from my POV), existing `RangeQuery` does not have 
a guarantee that data is returned in any specific order (not even on a 
per-partition basis). It just happens that RocksDB (and, as pointed out by 
Hanyu already, also the built-in in-memory store that is based on a 
tree-map) allows us to return data ordered by key; as mentioned already, 
this guarantee is limited to a per-partition basis.


If there were a custom store based on a hashed key-value store, this 
store could implement RangeQuery and return data (even for a single 
partition) with no ordering, without violating the contract.




Thus, it could actually make sense to extend `RangeQuery` and allow 
three options: no-order, ascending, descending. For our existing 
Rocks/InMemory implementations, no-order could be equal to ascending and 
nothing changes effectively, but it might be a better API contract? -- 
If we assume that there might be a custom hash-based store, such a store 
could reject a query if "ascending" is required, or might need to do 
more work to implement it (up to the store maintainer). This is actually 
the beauty of IQv2 that different stores can pick what queries they want 
to support.
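
Just as a sketch of what such a three-way contract could look like (the 
names are assumptions, not a concrete proposal):

```java
// Hypothetical ordering contract for range query results.
public enum ResultOrder {
    ANY,        // store may return keys in any order (e.g. a hash-based store)
    ASCENDING,  // keys ascending within each partition
    DESCENDING  // keys descending within each partition
}
```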


From an API contract point of view, it seems confusing to say: 
specifying nothing means no guarantee (or ascending if the store can 
offer it), but descending can be explicitly requested. Thus, a hash-based 
store might be able to accept an "order not specified" query, but would 
reject "descending". This seems to be somewhat unbalanced?


Thus, I am wondering if we should actually add `withAscendingKeys()`, 
too, even if it won't impact our current RocksDB/In-Memory implementations?



The second question is about per-partition or across-partition ordering: 
it's not possible right now to actually offer across-partition ordering 
the way IQv2 is set up. The reason is that the store that implements a 
query type is always a single shard. Thus, the implementation does not 
have access to other shards. It's hard-coded inside Kafka Streams to 
query each shard, to "accumulate" partial results, and to return them 
back to the user. Note that the API is:




StateQueryResult<R> result = KafkaStreams.query(...);
Map<Integer, QueryResult<R>> resultPerPartitions = result.getPartitionResults();



Thus, if we wanted to offer across-partition ordering, we could not do 
it right now, because Kafka Streams does not know anything about the 
semantics of the query it distributes... the result is an unknown 
type <R>. We would need to extend IQv2 with an additional mechanism 
that allows users to plug in custom code to "merge" multiple 
partition results into a "global result". This is clearly out of scope 
for this KIP and would require a new KIP by itself.
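
To illustrate what such a "merge" step would have to do, a caller-side 
sketch, under the assumption that each partition result is an ascending 
List<KeyValue<K, V>>; this is not part of IQv2:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.apache.kafka.streams.KeyValue;

// Caller-side sketch: combine ascending per-partition lists into one
// globally ascending list. A future KIP would need to generalize this.
public class ResultMerger {
    public static <K extends Comparable<K>, V> List<KeyValue<K, V>> mergeAscending(
            Collection<List<KeyValue<K, V>>> partitionResults) {
        List<KeyValue<K, V>> merged = new ArrayList<>();
        partitionResults.forEach(merged::addAll);
        merged.sort((a, b) -> a.key.compareTo(b.key)); // concatenate, then sort
        return merged;
    }
}
```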


It seems that this contract, which is independent of the query type, is 
not well understood, and thus a big +1 to fix the documentation. I don't 
think that this KIP must "define" anything, but it might of course be 
worth adding an explanation of why the KIP cannot even offer 
global ordering, as it's defined/limited by the IQv2 "framework" itself, 
not the individual queries.




-Matthias




On 10/4/23 4:38 PM, Hao Li wrote:

Hi Hanyu,

Thanks for the KIP! Seems there are already a lot of good discussions. I
only have two comments:

1. Please make it clear in
```
 /**
  * Interactive range query using a lower and upper bound to filter the keys returned.
  * @param lower The key that specifies the lower bound of the range
  * @param upper The key that specifies the upper bound of the range
  * @param <K> The key type
  * @param <V> The value type
  */
 public static <K, V> RangeQuery<K, V> withRange(final K lower, final K upper) {
     return new RangeQuery<>(Optional.ofNullable(lower),
                             Optional.ofNullable(upper), true);
 }
```
that a `null` in lower or upper parameter means it's unbounded.
2. What's the behavior if lower is 3 and upper is 1? Is it IllegalArgument
or will this return an empty result? Maybe also clarify this in the
document.

Thanks,
Hao


On Wed, Oct 4, 2023 at 9:27 AM Hanyu (Peter) Zheng
 wrote:


For testing purposes, we previously used a Set to record the results in
IQv2StoreIntegrationTest. Let's take an example where we now have two
partitions and four key-value pairs: <0,0> in p0, <1,1> in p1, <2,2> in p0,
and <3,3> in p1.

If we execute withRange(1,3), it will return a Set of <1, 2, 3>. However,
if we run withRange(1,3).withDescendingKeys(), and still use a Set, the
result will again be a Set of <1,2,3>. This means we won't be able to
determine whether the results have been reversed.

To resolve this ambiguity, I've switched to using a List to record the
results, ensuring the order of retrieval from partitions p0 and p1. So,
withRange(1,3) would yield a List of [2, 1, 3], whereas
withRange(1,3).withDescendingKeys() 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-04 Thread Matthias J. Sax

I agree with (almost) everything Bruno said.



In general, we tend to move away from using getters without "get", recently. So I would 
keep the "get".


This is new to me? Can you elaborate on this point? Why do you think 
that's the case?


I actually did realize (after Walker mentioned it) that existing query 
types use `get` prefix, but to me it seems that it was by accident and 
we should consider correcting it? Thus, I would actually prefer to not 
add the `get` prefix for new methods in query types.


IMHO, we should do a follow up KIP to deprecate all methods with `get` 
prefix and replace them with new ones without `get` -- it's of course 
always kinda "unnecessary" noise, but if we don't do it, we might get 
into more and more inconsistent naming, which would result in a "bad" API.


If we indeed want to change the convention and use the `get` prefix, I 
would strongly advocate to bite the bullet and do a KIP to proactively add 
the `get` "everywhere" it's missing... But overall, it seems to be a 
much broader decision, and we should get buy in from many committers 
about it -- as long as there is no broad consensus to add `get` 
everywhere, I would strongly prefer not to diverge from the current 
agreement to omit `get`.




-Matthias




On 10/4/23 2:36 AM, Bruno Cadonna wrote:

Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to 
VersionedRecord or we need to return tombstones, otherwise the result is 
not complete, because users could never know a record was deleted at 
some point before the second non-null value was put.
I prefer adding the validTo field since it makes the result more 
concise and easier to interpret.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones into 
a versioned state store? I guess if somebody does that on purpose, then 
there should be a way to retrieve each of those tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see the 
need.
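
To make the validTo idea concrete, a minimal sketch; the field name and 
constructor are assumptions, not the current VersionedRecord API:

```java
import java.util.Optional;

// Sketch only: a versioned record carrying an optional validTo timestamp.
// An empty validTo would mean the version is still the latest one.
public class VersionedRecordSketch<V> {
    private final V value;
    private final long timestamp;
    private final Optional<Long> validTo;

    public VersionedRecordSketch(V value, long timestamp, Optional<Long> validTo) {
        this.value = value;
        this.timestamp = timestamp;
        this.validTo = validTo;
    }

    public V value() { return value; }
    public long timestamp() { return timestamp; }
    public Optional<Long> validTo() { return validTo; }
}
```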



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or .until()). 
If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs), 
because the query becomes more readable:


MultiVersionedKeyQuery
   .withKey(1)
   .withinTimeRange(Instant.parse("2023-08-03T10:37:30.00Z"),
                    Instant.parse("2023-08-04T10:37:30.00Z"))


If we stay with .from() and .until(), we should consider .fromTime() and 
.untilTime() (or .toTime()):


MultiVersionedKeyQuery
  .withKey(1)
  .fromTime(Instant.parse("2023-08-03T10:37:30.00Z"))
  .untilTime(Instant.parse("2023-08-04T10:37:30.00Z"))



Regarding asOf vs. until:
I think asOf() is more used in point-in-time queries, as Walker mentioned, 
whereas this KIP specifies a time range. IMO asOf() fits very well with 
KIP-960 where one version is queried, but here I think .until() fits 
better. That might just be a matter of taste and in the end I am fine 
with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I would 
keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh, thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally, AsOf makes more sense to me, 
whereas until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get". We don't
normally use that in getters.

   /**
    * The key that was specified for this query.
    */
   public K getKey();

   /**
    * The starting time point of the query, if specified
    */
   public Optional getFromTimestamp();

   /**
    * The ending time point of the query, if specified
    */
   public Optional getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the 
direction of the KIP and think the functionality is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:


Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her email) if
a range query would not return any tombstones, i.e., only two records in
Victoria's example. Thus, it seems best to include a `validTo` ts-field
to 

RE: [DISCUSS] KIP-939: Support Participation in 2PC

2023-10-04 Thread Raman Verma
Hello Artem,

Now that `InitProducerIdRequest` will have an extra parameter (enable2PC),
can the client change the value of this parameter during an ongoing
transaction?

Here is how the transaction coordinator responds to InitProducerId requests
according
to the current transaction's state.

- Empty | CompleteAbort | CompleteCommit
Bump epoch and move to Empty state. Accept any changes from incoming
InitProducerId
request like transactionTimeoutMs

- Ongoing
Bump epoch and move to PrepareEpochFence state. Transaction time out is not
changed.

- PrepareAbort | PrepareCommit
No changes internally. Return Concurrent transactions error to the client.

I guess we should allow the same behavior for mutating the enable2PC flag
under these conditions as for the transaction timeout value.
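
A rough sketch of that state handling, extended with the enable2PC flag
(illustrative names only, not the actual coordinator code):

```java
// Illustrative only; not the actual TransactionCoordinator implementation.
public class InitProducerIdSketch {
    enum TxnState { EMPTY, COMPLETE_ABORT, COMPLETE_COMMIT, ONGOING, PREPARE_ABORT, PREPARE_COMMIT }
    enum Outcome { BUMP_EPOCH_AND_ACCEPT_CHANGES, FENCE_ONGOING_TXN, CONCURRENT_TRANSACTIONS_ERROR }

    static Outcome onInitProducerId(TxnState state, boolean enable2PC) {
        switch (state) {
            case EMPTY:
            case COMPLETE_ABORT:
            case COMPLETE_COMMIT:
                // bump epoch, move to Empty; accept transactionTimeoutMs and,
                // per the question above, presumably enable2PC changes too
                return Outcome.BUMP_EPOCH_AND_ACCEPT_CHANGES;
            case ONGOING:
                // bump epoch, move to PrepareEpochFence; settings unchanged
                return Outcome.FENCE_ONGOING_TXN;
            default: // PREPARE_ABORT, PREPARE_COMMIT
                return Outcome.CONCURRENT_TRANSACTIONS_ERROR;
        }
    }
}
```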


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-10-04 Thread Igor Soarez
Hi everyone,

Earlier today Colin, Ron, Proven and I had a chat about this work.
We discussed several aspects which I’d like to share here.

## A new reserved UUID

We'll reserve a third UUID to indicate an unspecified dir,
but one that is known to be selected. As opposed to the
default UNKNOWN_DIR (ZERO_UUID) which is used for new replicas,
which may or may not have been placed in a some directory,
this new UUID can disambiguate transition scenarios where
we previously did not have any directory assignment
information — e.g. single-logdir KRaft into JBOD KRaft,
or ZK JBOD to KRaft JBOD mode.

Unless anyone has a better suggestion for naming or any objection,
I'll update the KIP to designate the following reserved UUIDs:

  * UNKNOWN_DIR  - new Uuid(0L, 0L)
  * OFFLINE_DIR  - new Uuid(0L, 1L)
  * SELECTED_DIR - new Uuid(0L, 2L) <-- new

When transitioning to the directory assignment feature,
without any previous directory assignment state, the
controller can assign all existing partitions in the broker
to SELECTED_DIR, to distinguish them from new partitions.

When a log directory is offline, it is important that the
broker does not replace the offline partitions in the remaining
online directories. So, if some directory is offline, and
some partition is missing the directory assignment, then
it is important to distinguish new partitions from old ones.
Old partitions may already exist in the offline dirs, but
new partitions can safely be placed in the available (online) dirs.
In ZK mode, the `isNew` flag in the LeaderAndIsr request
serves this purpose. And for KRaft this KIP proposes keeping
the broker in fenced mode until all initial assignments are
known. But this additional UUID serves as an additional
signal and covers a gap in the ZK->KRaft migration stage,
where ZK brokers do not support fencing state.

SELECTED_DIR is always a temporary transition state, which
is due to be resolved by the broker. When catching up with
metadata, for any partitions associated with SELECTED_DIR:

  * If the partition is found in some directory, AssignReplicasToDirs
  is used to correct the assignment to the actual dir.

  * If the partition is not found, and no directory is offline,
  a directory is selected, and AssignReplicasToDirs is used to
  correct the assignment to the chosen directory.

  * If the partition is not found and some directory is offline,
  the broker assumes that the partition must be in one of the
  offline dirs and AssignReplicasToDirs is used to converge the state
  to OFFLINE_DIR.

This contrasts with UNKNOWN_DIR, for which brokers always select
a directory, regardless of the online/offline state of any log dirs.

## Reserving a pool of non-designated UUIDs for future use

It’s probably a good idea to reserve a bunch of UUIDs
for future use in directory assignments. The decision
is essentially costless right now, and it may prove to
be useful in the future. The first 100 UUIDs (including
the 3 already designated above) will be reserved for future use.
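
As a small sketch of the scheme (the constants follow this discussion; the
helper is an assumption):

```java
import org.apache.kafka.common.Uuid;

// Reserved directory IDs as discussed above.
public class DirectoryIds {
    public static final Uuid UNKNOWN_DIR  = new Uuid(0L, 0L);
    public static final Uuid OFFLINE_DIR  = new Uuid(0L, 1L);
    public static final Uuid SELECTED_DIR = new Uuid(0L, 2L); // new

    // First 100 IDs reserved for future use.
    public static boolean isReserved(Uuid id) {
        return id.getMostSignificantBits() == 0L
                && Long.compareUnsigned(id.getLeastSignificantBits(), 100L) < 0;
    }
}
```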

## Dir failures during ZK->KRaft migration

The KRaft controller's ZK compatibility functionality
does not currently implement dir failure handling. So we need
to select a strategy to deal with dir failures during the migration.

We discussed different options:

  a) If a migrating ZK mode broker encounters a directory failure,
  it will shut down. While this degrades failure handling during
  the temporary migration window, it is a useful simplification.
  This is an attractive option, and it isn't ruled out, but it
  is also not clear that it is necessary at this point.

  b) Extending the ZK Controller compatibility functionality in
  KRaft controllers to watch the /log_dir_event_notification
  znode, and rely on LeaderAndIsr requests for dir failure handling,
  same as ZK Controllers do. As there is a desire to limit the scope
  of the compatibility functionality, this option looks less attractive.

  c) Extend the ZK mode broker functionality during the migration
  to both send AssignReplicasToDirs populating the assignment state
  earlier, and propagate dir failures in the heartbeat in the same
  way the KIP proposes regular KRaft brokers do as well.
  There are several phases in the ZK->KRaft migration, and part of
  the process requires ZK mode brokers to send BrokerRegistration
  and BrokerHeartbeat requests already, so this doesn't look like
  a big change, and seems to be the preferred option.

If you have any thoughts or questions on any of these matters,
please let me know.

Best,

--
Igor


Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-04 Thread Jun Rao
Hi, Calvin,

Thanks for the KIP. +1 from me too.

Jun

On Wed, Sep 20, 2023 at 5:28 PM Justine Olshan 
wrote:

> Thanks Calvin.
> I think this will be very helpful going forward to minimize data loss.
>
> +1 from me (binding)
>
> Justine
>
> On Wed, Sep 20, 2023 at 3:42 PM Calvin Liu 
> wrote:
>
> > Hi all,
> > I'd like to call for a vote on KIP-966 which includes a series of
> > enhancements to the current ISR model.
> >
> >- Introduce the new HWM advancement requirement which enables the
> system
> >to have more potentially data-safe replicas.
> >- Introduce Eligible Leader Replicas(ELR) to represent the above
> >data-safe replicas.
> >- Introduce Unclean Recovery process which will deterministically
> choose
> >the best replica during an unclean leader election.
> >
> >
> > KIP:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> >
> > Discussion thread:
> > https://lists.apache.org/thread/gpbpx9kpd7c62dm962h6kww0ghgznb38
> >
>


Re: [DISCUSS] KIP-979: Allow independently stop KRaft controllers or brokers

2023-10-04 Thread Hailey Ni
Hi Federico,

Thanks for your comments. Sure, I will update the KIP accordingly.

Thanks,
Hailey

On Tue, Oct 3, 2023 at 1:29 AM Federico Valeri  wrote:

> Hi Hailey, thanks for the KIP.
>
> I also agree that the two mutually exclusive args are better. In order
> to be consistent with the other tools, I would suggest to use
> --process-role and --node-id (hyphen instead of dot). Can you also
> update the KIP?
>
> On Mon, Oct 2, 2023 at 10:18 PM Hailey Ni 
> wrote:
> >
> > Hi Kamal,
> >
> > I think the broker.id property has been replaced with the `node.id`
> property
> > in KRaft.  The documentation for `node.id` says it is required (
> >
> https://github.com/apache/kafka/blob/72e275f6ea867747e6b4e524c80d5ebd726ac25b/core/src/main/scala/kafka/server/KafkaConfig.scala#L741
> ),
> > and the QuickStart files all use it (
> >
> https://github.com/apache/kafka/tree/72e275f6ea867747e6b4e524c80d5ebd726ac25b/config/kraft
> ).
> > It is technically true that these two configs are treated as synonyms of
> > one another (
> >
> https://github.com/apache/kafka/blob/72e275f6ea867747e6b4e524c80d5ebd726ac25b/core/src/main/scala/kafka/server/KafkaConfig.scala#L1587-L1597
> ),
> > so if you specify either one the process will still recognize it and
> > start.  But it makes sense to exclusively use `node.id` in KRaft
> because a
> > node isn't necessarily a broker anymore; it could be a controller (or
> even
> > a combined broker+controller).
> >
> > Thanks,
> > Hailey
> >
> > On Mon, Oct 2, 2023 at 1:17 PM Hailey Ni  wrote:
> >
> > > Hi Ismael,
> > >
> > > Thanks for the comments. I'll change the implementation to use a pair
> of
> > > mutually exclusive args --process.roles and --node.id.
> > >
> > > Thanks,
> > > Hailey
> > >
> > > On Mon, Oct 2, 2023 at 6:34 AM Ismael Juma  wrote:
> > >
> > >> Hi Ron,
> > >>
> > >> Yes, that's what I am proposing, yes.
> > >>
> > >> Ismael
> > >>
> > >> On Sat, Sep 30, 2023 at 2:30 PM Ron Dagostino 
> wrote:
> > >>
> > >> > Thanks, Ismael.  I think you are proposing a pair of mutually
> exclusive
> > >> > args --process.roles and --node.id, right?  I agree that is more
> > >> > user-friendly than the --required-config arg, and it comes at the
> > >> possible
> > >> > expense of generality.  So that’s the tradeoff between the two, I
> think.
> > >> > No other config comes to mind now that we’ve identified these two.
> I
> > >> think
> > >> > the two specific and mutually exclusive parameters would be the way
> to
> > >> go
> > >> > unless someone else identifies still more options that people might
> > >> want.
> > >> >
> > >> > Did I get that right, or were you proposing something different?
> > >> >
> > >> > Ron
> > >> >
> > >> > > On Sep 30, 2023, at 10:42 AM, Ismael Juma 
> wrote:
> > >> > >
> > >> > > Hi,
> > >> > >
> > >> > > Thanks for the KIP. I think this approach based on configs is a
> bit
> > >> too
> > >> > > open ended and not very user friendly. Why don't we simply provide
> > >> flags
> > >> > > for the things a user may care about? So far, it seems like we
> have
> > >> two
> > >> > > good candidates (node id and process role). Are there any others?
> > >> > >
> > >> > > Ismael
> > >> > >
> > >> > >> On Fri, Sep 29, 2023 at 6:19 PM Hailey Ni
> 
> > >> > wrote:
> > >> > >>
> > >> > >> Hi Ron,
> > >> > >>
> > >> > >> I think you made a great point, making the "name" arbitrary
> instead
> > >> of
> > >> > >> hard-coding it will make the functionality much more flexible.
> I've
> > >> > updated
> > >> > >> the KIP and the code accordingly. Thanks for the great idea!
> > >> > >>
> > >> > >> Thanks,
> > >> > >> Hailey
> > >> > >>
> > >> > >>
> > >> > >>> On Fri, Sep 29, 2023 at 2:34 PM Ron Dagostino <
> rndg...@gmail.com>
> > >> > wrote:
> > >> > >>>
> > >> > >>> Thanks, Hailey.  Is there a reason to restrict it to just
> > >> > >>> process.roles and node.id?  Someone might want to do
> > >> > >>> "--required-config any.name=whatever.value", for example, and
> at
> > >> first
> > >> > >>> glance I don't see a reason why the implementation should be any
> > >> > >>> different -- it seems it would probably be easier to not have to
> > >> worry
> > >> > >>> about restricting to specific cases, actually.  WDYT?
> > >> > >>>
> > >> > >>> Ron
> > >> > >>>
> > >> > >>> On Fri, Sep 29, 2023 at 5:12 PM Hailey Ni
>  > >> >
> > >> > >>> wrote:
> > >> > 
> > >> >  Updated. Please let me know if you have any additional
> comments.
> > >> Thank
> > >> > >>> you!
> > >> > 
> > >> >  On Thu, Sep 21, 2023 at 3:02 PM Hailey Ni 
> > >> wrote:
> > >> > 
> > >> > > Hi Ron. Thanks for the response. I agree with your point. I'll
> > >> make
> > >> > >> the
> > >> > > corresponding changes in the KIP and KAFKA-15471
> > >> > > .
> > >> > >
> > >> > > On Thu, Sep 21, 2023 at 1:40 PM Ron Dagostino <
> rndg...@gmail.com>
> > >> > >>> wrote:
> > >> > >
> > >> > >> Hi Hailey.  No, I just looked, and 

[jira] [Created] (KAFKA-15548) Handling close() properly

2023-10-04 Thread Philip Nee (Jira)
Philip Nee created KAFKA-15548:
--

 Summary: Handling close() properly
 Key: KAFKA-15548
 URL: https://issues.apache.org/jira/browse/KAFKA-15548
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Philip Nee
Assignee: Philip Nee


Upon closing we need to:
 # Complete pending commits
 # Auto-commit if needed
 # Send the last ConsumerGroupHeartbeatRequest with epoch = -1 to leave the 
group
 # poll the NetworkClient to complete pending I/O
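
A sketch of the implied ordering (every helper below is a hypothetical 
stand-in for the new consumer's internals):

```java
import java.time.Duration;

// Illustrative-only ordering of the close() steps listed above.
class ConsumerCloseSketch {
    void close(Duration timeout) {
        completePendingCommits(timeout);     // 1. complete pending commits
        maybeAutoCommit();                   // 2. auto-commit if enabled
        sendLeaveGroupHeartbeat();           // 3. last heartbeat, epoch = -1
        pollNetworkClientUntilDone(timeout); // 4. complete pending I/O
    }

    void completePendingCommits(Duration timeout) { /* stub */ }
    void maybeAutoCommit() { /* stub */ }
    void sendLeaveGroupHeartbeat() { /* stub */ }
    void pollNetworkClientUntilDone(Duration timeout) { /* stub */ }
}
```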



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2256

2023-10-04 Thread Apache Jenkins Server
See