Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-04-19 Thread Philip Nee
Hey Colin,

I still need to finish reading and understanding the KIP, but I have a
couple of comments despite being ignorant of most of the KRaft stuff.
(Sorry!)

Firstly, does it make sense to create an extension of the current
AdminClient only to handle these specific KRaft use cases? It seems
cumbersome to carry two sets of bootstrap configurations just to make the
AdminClient generic enough to handle them; instead, maybe it is more
obvious (to me) to just extend the AdminClient. What I'm thinking of is a
KraftAdminClient that continues to use *bootstrap.servers* but only serves
the KRaft controller APIs.

Secondly, if we want to continue with the proposed design, I'm not yet sure
why we can't keep using *bootstrap.servers*. I assume that once the client
gets the metadata, it knows who it is talking to. I'm just reconsidering
your alternative again.

A (possibly bad) idea: why don't we continue using *bootstrap.servers* but
add a separate config like *kraft.controller* = true/false? I feel like
most users might not know what a controller is, which could cause mistakes
down the road.

Thanks,
P

On Wed, Apr 19, 2023 at 2:18 PM Colin McCabe  wrote:

> Hi all,
>
> I wrote a short KIP about allowing AdminClient to talk directly with the
> KRaft controller quorum. Check it out here:
>
> https://cwiki.apache.org/confluence/x/Owo0Dw
>
> best,
> Colin
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1780

2023-04-19 Thread Apache Jenkins Server
See 




Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-04-19 Thread ziming deng
Hello Colin,
One note: we already use `--bootstrap-server` rather than
`--bootstrap-servers` (arguably a historical mistake), so should the new
argument likewise be `--bootstrap-controller` (no "s") for consistency?

--
Ziming

> On Apr 20, 2023, at 05:17, Colin McCabe  wrote:
> 
> Hi all,
> 
> I wrote a short KIP about allowing AdminClient to talk directly with the 
> KRaft controller quorum. Check it out here:
> 
> https://cwiki.apache.org/confluence/x/Owo0Dw
> 
> best,
> Colin



Re: [DISCUSS] KIP-909: Allow clients to rebootstrap DNS lookup failure

2023-04-19 Thread Philip Nee
Hey Jason,

Thanks for your review. If we make it a retriable error, does it still make
sense to have a configurable timeout, given that we expect the user to
continue retrying anyway?

I'm considering the case of a bad configuration. If the user retries on the
error, then we rely on the error/warning logging to alert them. In that
case, maybe we keep the proposed behavior, i.e. warn on each poll after the
timeout period.

If you agree that a configuration is needed, maybe we can call it
*bootstrap.auto.retry.ms* instead, to indicate a configurable period of
automatic retry. What do you think?

Cheers,
P

On Wed, Apr 19, 2023 at 7:17 PM Jason Gustafson 
wrote:

> Hey Phillip,
>
> The KIP looks good. 5 minutes seems like a reasonable tradeoff. I do wonder
> if it is necessary to treat bootstrap timeout as a fatal error though. It
> seems possible that the exception might be caught by handlers in existing
> applications which may not expect that the client needs to be restarted.
> Perhaps it would be safer to make it retriable? As long as the application
> continues trying to use the client, we could continue trying to reach the
> bootstrap servers perhaps? That would be closer to behavior today which
> only treats DNS resolution failures as fatal. What do you think?
>
> Best,
> Jason
>
> On Mon, Apr 10, 2023 at 1:53 PM Philip Nee  wrote:
>
> > Thanks, everyone: I'm starting a vote today.  Here's the recap for some
> of
> > the questions:
> >
> > John: I changed the proposal to throw a non-retriable exception after the
> > timeout elapses. I feel it might be necessary to poison the client after
> > retry expires, as it might indicate a real issue.
> > Ismael: The proposal is to add a configuration for the retry and it will
> > throw a non-retriable exception after the time expires.
> > Chris: Addressed some unclarity that you mentioned, and a new API won't
> be
> > introduced in this KIP.  Maybe up for future discussion.
> > Jason: I'm proposing adding a timeout config and a bootstrap exception
> per
> > your suggestion.
> > Kirk: I'm proposing throwing a non-retriable exception in the network
> > client. See previous comment.
> >
> >
> > On Mon, Feb 27, 2023 at 9:36 AM Chris Egerton 
> > wrote:
> >
> > > Hi Philip,
> > >
> > > Yeah,  "DNS resolution should occur..." seems like a better fit. 
> > >
> > > One other question I have is whether we should expose some kind of
> public
> > > API for performing preflight validation of the bootstrap URLs. If we
> > change
> > > the behavior of a client configured with a silly typo (e.g.,
> > > "loclahost instead of localhost") from failing in the constructor to
> > > failing with a retriable exception, this might lead some client
> > > applications to handle that failure by, well, retrying. For reference,
> > this
> > > is exactly what we do in Kafka Connect right now; see [1] and [2]. IMO
> > it'd
> > > be nice to be able to opt into keeping the current behavior so that
> > > projects like Connect could still do preflight checks of the
> > > bootstrap.servers property for connectors before starting them, and
> > report
> > > any issues by failing fast instead of continuously writing
> warning/error
> > > messages to their logs.
> > >
> > > I'm not sure about where this new API could go, but a few options might
> > be:
> > >
> > > - Expose a public variant of the existing ClientUtils class
> > > - Add static methods to the ConsumerConfig, ProducerConfig, and
> > > AdminClientConfig classes
> > > - Add those same static methods to the KafkaConsumer, KafkaProducer,
> and
> > > KafkaAdminClient classes
> > >
> > > If this seems reasonable, we should probably also specify in the KIP
> that
> > > Kafka Connect will leverage this preflight validation logic before
> > > instantiating any Kafka clients for use by connectors or tasks, and
> > > continue to fail fast if there are typos in the bootstrap.servers
> > property,
> > > or if temporary DNS resolution issues come up.
> > >
> > > [1] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/5f9d01668cae64b2cacd7872d82964fa78862aaf/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L606
> > > [2] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractWorkerSourceTask.java#L439
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Fri, Feb 24, 2023 at 4:59 PM Philip Nee 
> wrote:
> > >
> > > > Hey Chris,
> > > >
> > > > Thanks for the quick response, and I apologize for the unclear
> wording
> > > > there, I guess "DNS lookup" would be a more appropriate wording here.
> > So
> > > > what I meant there was, to delegate the DNS lookup in the constructor
> > to
> > > > the network client poll, and it will happen on the very first poll.
> I
> > > > guess the logic could look like this:
> > > >
> > > > - if the client has been bootstrapped, do nothing.
> > > > - Otherwise, 

Re: [DISCUSS] KIP-909: Allow clients to rebootstrap DNS lookup failure

2023-04-19 Thread Jason Gustafson
Hey Phillip,

The KIP looks good. 5 minutes seems like a reasonable tradeoff. I do wonder
if it is necessary to treat bootstrap timeout as a fatal error though. It
seems possible that the exception might be caught by handlers in existing
applications which may not expect that the client needs to be restarted.
Perhaps it would be safer to make it retriable? As long as the application
continues trying to use the client, we could continue trying to reach the
bootstrap servers perhaps? That would be closer to behavior today which
only treats DNS resolution failures as fatal. What do you think?

Best,
Jason

On Mon, Apr 10, 2023 at 1:53 PM Philip Nee  wrote:

> Thanks, everyone: I'm starting a vote today.  Here's the recap for some of
> the questions:
>
> John: I changed the proposal to throw a non-retriable exception after the
> timeout elapses. I feel it might be necessary to poison the client after
> retry expires, as it might indicate a real issue.
> Ismael: The proposal is to add a configuration for the retry and it will
> throw a non-retriable exception after the time expires.
> Chris: Addressed some unclarity that you mentioned, and a new API won't be
> introduced in this KIP.  Maybe up for future discussion.
> Jason: I'm proposing adding a timeout config and a bootstrap exception per
> your suggestion.
> Kirk: I'm proposing throwing a non-retriable exception in the network
> client. See previous comment.
>
>
> On Mon, Feb 27, 2023 at 9:36 AM Chris Egerton 
> wrote:
>
> > Hi Philip,
> >
> > Yeah,  "DNS resolution should occur..." seems like a better fit. 
> >
> > One other question I have is whether we should expose some kind of public
> > API for performing preflight validation of the bootstrap URLs. If we
> change
> > the behavior of a client configured with a silly typo (e.g.,
> > "loclahost instead of localhost") from failing in the constructor to
> > failing with a retriable exception, this might lead some client
> > applications to handle that failure by, well, retrying. For reference,
> this
> > is exactly what we do in Kafka Connect right now; see [1] and [2]. IMO
> it'd
> > be nice to be able to opt into keeping the current behavior so that
> > projects like Connect could still do preflight checks of the
> > bootstrap.servers property for connectors before starting them, and
> report
> > any issues by failing fast instead of continuously writing warning/error
> > messages to their logs.
> >
> > I'm not sure about where this new API could go, but a few options might
> be:
> >
> > - Expose a public variant of the existing ClientUtils class
> > - Add static methods to the ConsumerConfig, ProducerConfig, and
> > AdminClientConfig classes
> > - Add those same static methods to the KafkaConsumer, KafkaProducer, and
> > KafkaAdminClient classes
> >
> > If this seems reasonable, we should probably also specify in the KIP that
> > Kafka Connect will leverage this preflight validation logic before
> > instantiating any Kafka clients for use by connectors or tasks, and
> > continue to fail fast if there are typos in the bootstrap.servers
> property,
> > or if temporary DNS resolution issues come up.
> >
> > [1] -
> >
> >
> https://github.com/apache/kafka/blob/5f9d01668cae64b2cacd7872d82964fa78862aaf/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L606
> > [2] -
> >
> >
> https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractWorkerSourceTask.java#L439
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, Feb 24, 2023 at 4:59 PM Philip Nee  wrote:
> >
> > > Hey Chris,
> > >
> > > Thanks for the quick response, and I apologize for the unclear wording
> > > there, I guess "DNS lookup" would be a more appropriate wording here.
> So
> > > what I meant there was, to delegate the DNS lookup in the constructor
> to
> > > the network client poll, and it will happen on the very first poll.  I
> > > guess the logic could look like this:
> > >
> > > - if the client has been bootstrapped, do nothing.
> > > - Otherwise, perform DNS lookup, and acquire the bootstrap server
> > address.
> > >
> > > Thanks for the comment there, I'll change up the wording.  Maybe revise
> > it
> > > as "DNS resolution should occur in the poll" ?
> > >
> > > P
> > >
> > > On Fri, Feb 24, 2023 at 1:47 PM Chris Egerton  >
> > > wrote:
> > >
> > > > Hi Philip,
> > > >
> > > > Thanks for the KIP!
> > > >
> > > > QQ: In the "Proposed Changes" section, the KIP states that
> > "Bootstrapping
> > > > should now occur in the poll method before attempting to update the
> > > > metadata. This includes resolving the addresses and bootstrapping the
> > > > metadata.". By "bootstrapping the metadata" do we mean actually
> > > contacting
> > > > the bootstrap servers, or just setting some internal state related to
> > the
> > > > current set of servers that can be contacted for metadata? I ask
> > because
> > > it
> > > > seems like the language here implies the 

Re: [DISCUSS] Re-visit end of life policy

2023-04-19 Thread Matthias J. Sax

While I understand the desire, I tend to agree with Ismael.

In general, it's a significant amount of work not just to do the actual 
releases, but also to cherry-pick bug fixes to older branches. Code 
diverges very quickly, and a clean cherry-pick is usually only possible 
for one or two branches. And it's not just simple conflicts that are 
easy to resolve; it often means writing an entirely new fix if the 
corresponding code was refactored, which is the case more often than one 
might think.


If there is no very strong ask from the community, I would rather let 
committers spend their time reviewing PRs instead and help contributors 
get their work merged.


Just my 2ct.

-Matthias


On 4/13/23 2:52 PM, Ismael Juma wrote:

Clarification below.

> I did not understand your point about maintenance expense to ensure
> compatibility. I am confused because, IMO, irrespective of our bug fix
> support duration for minor versions, we should ensure that all prior minor
> versions are compatible. Hence, increasing the support duration to 24
> months will not add more expense than today to ensure compatibility.



No, I am not saying that. I am saying that there is no reason not to
upgrade from one minor release to another since we provide full
compatibility between minor releases. The expensive part is that we release
3 times a year, so you have to support 6 releases at any given point in
time. More importantly, you have to validate all these releases, handle any
additional bugs and so on. When it comes to the CVE stuff, you also have to
deal with cases where a project you depend on forces an upgrade to a
release with compatibility impact and so on. Having seen this first hand,
it's a significant amount of work.

Ismael



Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1779

2023-04-19 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 467630 lines...]
[2023-04-19T23:22:42.565Z] 
[2023-04-19T23:22:42.565Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testConnectorBoundary PASSED
[2023-04-19T23:22:42.565Z] 
[2023-04-19T23:22:42.565Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testFencedLeaderRecovery STARTED
[2023-04-19T23:22:45.286Z] 
[2023-04-19T23:22:45.286Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testFencedLeaderRecovery PASSED
[2023-04-19T23:22:45.286Z] 
[2023-04-19T23:22:45.286Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testPollBoundary STARTED
[2023-04-19T23:22:53.285Z] 
[2023-04-19T23:22:53.285Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testPollBoundary PASSED
[2023-04-19T23:22:53.285Z] 
[2023-04-19T23:22:53.285Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testPreflightValidation STARTED
[2023-04-19T23:22:57.144Z] 
[2023-04-19T23:22:57.144Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest > 
testPreflightValidation PASSED
[2023-04-19T23:22:57.144Z] 
[2023-04-19T23:22:57.144Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testFailToStartWhenInternalTopicsAreNotCompacted STARTED
[2023-04-19T23:23:05.218Z] 
[2023-04-19T23:23:05.218Z] > Task :connect:mirror:integrationTest
[2023-04-19T23:23:05.218Z] 
[2023-04-19T23:23:05.218Z] Gradle Test Run :connect:mirror:integrationTest > 
Gradle Test Executor 134 > MirrorConnectorsIntegrationSSLTest > 
testSyncTopicConfigs() PASSED
[2023-04-19T23:23:05.218Z] 
[2023-04-19T23:23:05.218Z] Gradle Test Run :connect:mirror:integrationTest > 
Gradle Test Executor 134 > MirrorConnectorsIntegrationSSLTest > 
testRestartReplication() STARTED
[2023-04-19T23:23:26.973Z] 
[2023-04-19T23:23:26.973Z] > Task :connect:runtime:integrationTest
[2023-04-19T23:23:26.973Z] 
[2023-04-19T23:23:26.973Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testFailToStartWhenInternalTopicsAreNotCompacted PASSED
[2023-04-19T23:23:26.973Z] 
[2023-04-19T23:23:26.973Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testCreateInternalTopicsWithFewerReplicasThanBrokers STARTED
[2023-04-19T23:23:26.973Z] 
[2023-04-19T23:23:26.973Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testCreateInternalTopicsWithFewerReplicasThanBrokers PASSED
[2023-04-19T23:23:26.973Z] 
[2023-04-19T23:23:26.973Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testCreateInternalTopicsWithDefaultSettings STARTED
[2023-04-19T23:23:40.114Z] 
[2023-04-19T23:23:40.114Z] > Task :connect:mirror:integrationTest
[2023-04-19T23:23:40.114Z] 
[2023-04-19T23:23:40.114Z] Gradle Test Run :connect:mirror:integrationTest > 
Gradle Test Executor 134 > MirrorConnectorsIntegrationSSLTest > 
testRestartReplication() PASSED
[2023-04-19T23:23:40.114Z] 
[2023-04-19T23:23:40.114Z] Gradle Test Run :connect:mirror:integrationTest > 
Gradle Test Executor 134 > MirrorConnectorsIntegrationSSLTest > 
testOneWayReplicationWithFrequentOffsetSyncs() STARTED
[2023-04-19T23:23:58.245Z] 
[2023-04-19T23:23:58.245Z] > Task :connect:runtime:integrationTest
[2023-04-19T23:23:58.245Z] 
[2023-04-19T23:23:58.245Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testCreateInternalTopicsWithDefaultSettings PASSED
[2023-04-19T23:23:58.245Z] 
[2023-04-19T23:23:58.245Z] Gradle Test Run :connect:runtime:integrationTest > 
Gradle Test Executor 139 > 
org.apache.kafka.connect.integration.InternalTopicsIntegrationTest > 
testStartWhenInternalTopicsCreatedManuallyWithCompactForBrokersDefaultCleanupPolicy
 STARTED
[2023-04-19T23:23:58.245Z] 
[2023-04-19T23:23:58.246Z] Gradle Test Run 

Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-04-19 Thread Colin McCabe
On Wed, Apr 19, 2023, at 14:37, Ron Dagostino wrote:
> Thanks for the KIP, Colin.
>
> There seems to be some inconsistency between sometimes referring to
> "TargetKRaftControllerQuorum" and other times referring to
> "DirectToKRaftControllerQuorum".  Aside from that, it looks good to
> me.  The symmetry of bootstrap servers and bootstrap controllers feels
> right.
>

Hi Ron,

Good point. I will replace all of them with DirectToKRaftControllerQuorum.

> What happens in the combined case?  Will it depend on the listener
> that gets the request?  So if you put the port for a controller
> listener in bootstrap servers and that listener gets the request then
> it fails, and vice-versa if you put the port for a non-controller
> listener in bootstrap controllers and that listener gets the request
> it also fails?

Correct. Controller listeners are always distinct from broker listeners.
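
As a rough illustration only -- "bootstrap.controllers" is the property name
proposed in the KIP and could still change -- an admin client aimed at the
quorum would be configured with the controller endpoints instead of (never in
addition to) bootstrap.servers:

import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.DescribeClusterResult;

public class ControllerAdminSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Proposed in KIP-919; mutually exclusive with bootstrap.servers.
        props.put("bootstrap.controllers",
                  "controller1:9093,controller2:9093,controller3:9093");

        // Only the subset of Admin APIs served by the controllers would work here.
        try (Admin admin = Admin.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            System.out.println("Cluster id: " + cluster.clusterId().get());
        }
    }
}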

best,
Colin

>
> Ron
>
> On Wed, Apr 19, 2023 at 5:18 PM Colin McCabe  wrote:
>>
>> Hi all,
>>
>> I wrote a short KIP about allowing AdminClient to talk directly with the 
>> KRaft controller quorum. Check it out here:
>>
>> https://cwiki.apache.org/confluence/x/Owo0Dw
>>
>> best,
>> Colin


Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-04-19 Thread Ron Dagostino
Thanks for the KIP, Colin.

There seems to be some inconsistency between sometimes referring to
"TargetKRaftControllerQuorum" and other times referring to
"DirectToKRaftControllerQuorum".  Aside from that, it looks good to
me.  The symmetry of bootstrap servers and bootstrap controllers feels
right.

What happens in the combined case?  Will it depend on the listener
that gets the request?  So if you put the port for a controller
listener in bootstrap servers and that listener gets the request then
it fails, and vice-versa if you put the port for a non-controller
listener in bootstrap controllers and that listener gets the request
it also fails?

Ron

On Wed, Apr 19, 2023 at 5:18 PM Colin McCabe  wrote:
>
> Hi all,
>
> I wrote a short KIP about allowing AdminClient to talk directly with the 
> KRaft controller quorum. Check it out here:
>
> https://cwiki.apache.org/confluence/x/Owo0Dw
>
> best,
> Colin


KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-04-19 Thread Colin McCabe
Hi all,

I wrote a short KIP about allowing AdminClient to talk directly with the KRaft 
controller quorum. Check it out here:

https://cwiki.apache.org/confluence/x/Owo0Dw

best,
Colin


[jira] [Resolved] (KAFKA-4327) Move Reset Tool from core to streams

2023-04-19 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-4327.

Fix Version/s: (was: 4.0.0)
   Resolution: Fixed

This was resolved via https://issues.apache.org/jira/browse/KAFKA-14586.

> Move Reset Tool from core to streams
> 
>
> Key: KAFKA-4327
> URL: https://issues.apache.org/jira/browse/KAFKA-4327
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Blocker
>  Labels: kip
>
> This is a follow up on https://issues.apache.org/jira/browse/KAFKA-4008
> Currently, Kafka Streams Application Reset Tool is part of {{core}} module 
> due to ZK dependency. After KIP-4 got merged, this dependency can be dropped 
> and the Reset Tool can be moved to {{streams}} module.
> This should also update {{InternalTopicManager#filterExistingTopics}}, which 
> refers to ResetTool in an exception message:
>  {{"Use 'kafka.tools.StreamsResetter' tool"}}
>  -> {{"Use '" + kafka.tools.StreamsResetter.getClass().getName() + "' tool"}}
> Doing this JIRA also requires updating the docs with regard to broker 
> backward compatibility – not all brokers support "topic delete request" and 
> thus the reset tool will not be backward compatible with all broker versions.
> KIP-756: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-756%3A+Move+StreamsResetter+tool+outside+of+core]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-04-19 Thread Nick Telford
Hi Colt,

The issue is that if there's a crash between 2 and 3, then you still end up
with inconsistent data in RocksDB. The only way to guarantee that your
checkpoint offsets and locally stored data are consistent with each other
is to commit them atomically, which can be achieved by storing the offsets
in RocksDB.

The offsets column family is likely to be extremely small (one
per-changelog partition + one per Topology input partition for regular
stores, one per input partition for global stores). So the overhead will be
minimal.

A major benefit of doing this is that we can remove the explicit calls to
db.flush(), which forcibly flushes memtables to disk on-commit. It turns
out, RocksDB memtable flushes are largely dictated by Kafka Streams
commits, *not* RocksDB configuration, which could be a major source of
confusion. Atomic checkpointing makes it safe to remove these explicit
flushes, because it no longer matters exactly when RocksDB flushes data to
disk; since the data and corresponding checkpoint offsets will always be
flushed together, the local store is always in a consistent state, and
on-restart, it can always safely resume restoration from the on-disk
offsets, restoring the small amount of data that hadn't been flushed when
the app exited/crashed.
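
To make this concrete, here is a minimal, self-contained sketch of the general
technique using the plain RocksDB Java API. It is not the actual KIP-892 code;
the path, column family name, keys and offset encoding are purely illustrative.
The point is that a record and the checkpoint offset that covers it go into a
single WriteBatch, so RocksDB persists them together or not at all:

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.rocksdb.ColumnFamilyDescriptor;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.DBOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

public class AtomicOffsetsSketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();

        List<ColumnFamilyDescriptor> descriptors = Arrays.asList(
                new ColumnFamilyDescriptor(RocksDB.DEFAULT_COLUMN_FAMILY),
                new ColumnFamilyDescriptor("offsets".getBytes(StandardCharsets.UTF_8)));
        List<ColumnFamilyHandle> handles = new ArrayList<>();

        try (DBOptions options = new DBOptions()
                     .setCreateIfMissing(true)
                     .setCreateMissingColumnFamilies(true);
             RocksDB db = RocksDB.open(options, "/tmp/atomic-offsets-sketch",
                     descriptors, handles)) {

            ColumnFamilyHandle data = handles.get(0);
            ColumnFamilyHandle offsets = handles.get(1);

            // The store update and its changelog offset share one atomic batch.
            try (WriteBatch batch = new WriteBatch();
                 WriteOptions writeOptions = new WriteOptions()) {
                batch.put(data, "some-key".getBytes(StandardCharsets.UTF_8),
                                "some-value".getBytes(StandardCharsets.UTF_8));
                batch.put(offsets, "changelog-topic-0".getBytes(StandardCharsets.UTF_8),
                                   "42".getBytes(StandardCharsets.UTF_8));
                db.write(writeOptions, batch); // either both land or neither does
            }

            handles.forEach(ColumnFamilyHandle::close);
        }
    }
}

Because the offsets ride along in the same atomic write, a crash can never
leave the data ahead of (or behind) the recorded changelog position.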

Regards,
Nick

On Wed, 19 Apr 2023 at 14:35, Colt McNealy  wrote:

> Nick,
>
> Thanks for your reply. Ack to A) and B).
>
> For item C), I see what you're referring to. Your proposed solution will
> work, so no need to change it. What I was suggesting was that it might be
> possible to achieve this with only one column family. So long as:
>
>- No uncommitted records (i.e. not committed to the changelog) are
>*committed* to the state store, AND
>- The Checkpoint offset (which refers to the changelog topic) is less
>than or equal to the last written changelog offset in rocksdb
>
> I don't see the need to do the full restoration from scratch. My
> understanding was that prior to 844/892, full restorations were required
> because there could be uncommitted records written to RocksDB; however,
> given your use of RocksDB transactions, that can be avoided with the
> pattern of 1) commit Kafka transaction, 2) commit RocksDB transaction, 3)
> update offset in checkpoint file.
>
> Anyways, your proposed solution works equivalently and I don't believe
> there is much overhead to an additional column family in RocksDB. Perhaps
> it may even perform better than making separate writes to the checkpoint
> file.
>
> Colt McNealy
> *Founder, LittleHorse.io*
>
>
> On Wed, Apr 19, 2023 at 5:53 AM Nick Telford 
> wrote:
>
> > Hi Colt,
> >
> > A. I've done my best to de-couple the StateStore stuff from the rest of
> the
> > Streams engine. The fact that there will be only one ongoing (write)
> > transaction at a time is not guaranteed by any API, and is just a
> > consequence of the way Streams operates. To that end, I tried to ensure
> the
> > documentation and guarantees provided by the new APIs are independent of
> > this incidental behaviour. In practice, you're right, this essentially
> > refers to "interactive queries", which are technically "read
> transactions",
> > even if they don't actually use the transaction API to isolate
> themselves.
> >
> > B. Yes, although not ideal. This is for backwards compatibility, because:
> > 1) Existing custom StateStore implementations will implement flush(),
> > and not commit(), but the Streams engine now calls commit(), so those
> calls
> > need to be forwarded to flush() for these legacy stores.
> > 2) Existing StateStore *users*, i.e. outside of the Streams engine
> > itself, may depend on explicitly calling flush(), so for these cases,
> > flush() needs to be redirected to call commit().
> > If anyone has a better way to guarantee compatibility without introducing
> > this potential recursion loop, I'm open to changes!
> >
> > C. This is described in the "Atomic Checkpointing" section. Offsets are
> > stored in a separate RocksDB column family, which is guaranteed to be
> > atomically flushed to disk with all other column families. The issue of
> > checkpoints being written to disk after commit causing inconsistency if
> it
> > crashes in between is the reason why, under EOS, checkpoint files are
> only
> > written on clean shutdown. This is one of the major causes of "full
> > restorations", so moving the offsets into a place where they can be
> > guaranteed to be atomically written with the data they checkpoint allows
> us
> > to write the checkpoint offsets *on every commit*, not just on clean
> > shutdown.
> >
> > Regards,
> > Nick
> >
> > On Tue, 18 Apr 2023 at 15:39, Colt McNealy  wrote:
> >
> > > Nick,
> > >
> > > Thank you for continuing this work. I have a few minor clarifying
> > > questions.
> > >
> > > A) "Records written to any transaction are visible to all other
> > > transactions immediately." I am confused here—I thought there could
> only
> > be
> > > one transaction 

[jira] [Created] (KAFKA-14922) kafka-streams-application-reset deletes topics not belonging to specified application-id

2023-04-19 Thread Jira
Jørgen created KAFKA-14922:
--

 Summary: kafka-streams-application-reset deletes topics not 
belonging to specified application-id
 Key: KAFKA-14922
 URL: https://issues.apache.org/jira/browse/KAFKA-14922
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.4.0
Reporter: Jørgen


Slack-thread: 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1681908267206849]

When running the command _kafka-streams-application-reset --bootstrap-servers 
$BOOTSTRAP --application-id foo_ all changelog and repartition topics that 
_start with_ foo are deleted. This happens even if there is no application-id 
named foo.

Example:
{code:java}
Application IDs:
foo-v1
foo-v2

Internal topics:
foo-v1-repartition-topic-repartition
foo-v2-repartition-topic-repartition 

Application reset:
kafka-streams-application-reset --bootstrap-servers $BOOTSTRAP --application-id 
foo
> No input or intermediate topics specified. Skipping seek.
Deleting inferred internal topics [foo-v2-repartition-topic-repartition, 
foo-v1-repartition-topic-repartition]
Done.{code}
Expected behaviour is that the command fails, as there is no application-id 
with the name foo, instead of deleting all foo* topics.

This is critical for typos, or when one application-id starts with the same 
name as another.

The bug should be located here: 
[https://github.com/apache/kafka/blob/c14f56b48461f01743146d58987bc8661ba0d459/tools/src/main/java/org/apache/kafka/tools/StreamsResetter.java#L693]

The tool should check that the topic matches the application-id exactly, 
instead of only checking that the topic starts with it.
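
For illustration, the prefix matching described above behaves roughly like the 
sketch below (this is NOT the actual StreamsResetter code, just a simplified 
model of the check):
{code:java}
public class PrefixMatchSketch {
    // Simplified stand-in for the internal-topic inference: prefix + suffix check.
    static boolean looksInternal(String topic, String applicationId) {
        return topic.startsWith(applicationId)
            && (topic.endsWith("-repartition") || topic.endsWith("-changelog"));
    }

    public static void main(String[] args) {
        // true -> deleted even though the topic belongs to application "foo-v1"
        System.out.println(looksInternal("foo-v1-repartition-topic-repartition", "foo"));
        // true -> the intended match for application "foo-v1"
        System.out.println(looksInternal("foo-v1-repartition-topic-repartition", "foo-v1"));
    }
}
{code}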



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Adding reviewers with Github actions

2023-04-19 Thread Ismael Juma
It's a lot more convenient to have it in the commit than having to follow
links, etc.

David Arthur also wrote a script to help with this step, I believe.

Ismael

On Tue, Apr 18, 2023, 9:29 AM Divij Vaidya  wrote:

> Do we even need a manual attribution for a reviewer in the commit message?
> GitHub automatically marks the folks as "reviewers" who have used the
> "review-changes" button on the top left corner and left feedback. GitHub
> also has searchability for such reviews done by a particular person using
> the following link:
>
> https://github.com/search?q=is%3Apr+reviewed-by%3A
> +repo%3Aapache%2Fkafka+repo%3Aapache%2Fkafka-site=issues
>
> (insert the GitHub username after "reviewed-by:" in the URL above)
>
> --
> Divij Vaidya
>
>
>
> On Tue, Apr 18, 2023 at 4:09 PM Viktor Somogyi-Vass
>  wrote:
>
> > I'm not that familiar with Actions either, it just seemed like a tool for
> > this purpose. :)
> > I did some digging, and what I have in mind is that a pull request review
> > can trigger a workflow:
> >
> >
> https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_review
> >
> > We could in theory use the GitHub CLI to edit the description of the PR when
> > someone gives a review (or we could perhaps enable this to simply comment
> > too):
> >
> >
> https://docs.github.com/en/actions/using-workflows/using-github-cli-in-workflows
> >
> > So the action definition would look something like this below. Note that
> > the "run" part is very basic, it's just here for the idea. We'll probably
> > need a shell script instead of that line to format it better. But the
> point
> > is that it edits the PR and adds the reviewer:
> >
> > name: Add reviewer
> > on:
> >   issues:
> >     types:
> >       - pull_request_review
> > jobs:
> >   comment:
> >     runs-on: ubuntu-latest
> >     steps:
> >       - run: gh pr edit $PR_ID --title "$PR_TITLE" --body "$PR_BODY\n\nReviewers: $SENDER"
> >         env:
> >           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
> >           PR_ID: ${{ github.event.pull_request.id }}
> >           PR_TITLE: ${{ github.event.pull_request.title }}
> >           PR_BODY: ${{ github.event.pull_request.body }}
> >           SENDER: ${{ github.event.sender }}
> >
> > I'll take a look if I can try this out on my fork and get back if it
> leads
> > to anything.
> >
> > Viktor
> >
> > On Tue, Apr 18, 2023 at 10:12 AM Josep Prat  >
> > wrote:
> >
> > > Hi all,
> > > Unless I miss something, wouldn't this GitHub action either amend the
> > > commit (breaking signature if any) or directly do the commit itself
> > > (meaning the action would be the one squashing and merging and not the
> > > maintainer anymore)?
> > >
> > > Let me know if I'm missing something or if there are some nice hidden
> > > tricks in GitHub that I didn't know :)
> > >
> > > Best,
> > > On Tue, Apr 18, 2023 at 9:48 AM Viktor Somogyi-Vass
> > >  wrote:
> > >
> > > > Hi all,
> > > >
> > > > Unfortunately I forgot to add myself as a reviewer *again* on a PR
> when
> > > > merging. Shame on me.
> > > > However I was thinking about looking into Github actions whether we
> can
> > > > automate this process or at least prevent PRs from merging that don't
> > > have
> > > > "reviewers" in the description.
> > > >
> > > > Has anyone ever looked at it, is it worth chasing this or does anyone
> > > know
> > > > anything that'd prevent us from using it?
> > > >
> > > > Viktor
> > > >
> > >
> > >
> > > --
> > >
> > > *Josep Prat*
> > > Open Source Engineering Director, *Aiven*
> > > josep.p...@aiven.io   |   +491715557497
> > > aiven.io   |   https://www.facebook.com/aivencloud   |
> > > https://twitter.com/aiven_io
> > > *Aiven Deutschland GmbH*
> > > Alexanderufer 3-7, 10117 Berlin
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
> >
>


Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-04-19 Thread Colt McNealy
Nick,

Thanks for your reply. Ack to A) and B).

For item C), I see what you're referring to. Your proposed solution will
work, so no need to change it. What I was suggesting was that it might be
possible to achieve this with only one column family. So long as:

   - No uncommitted records (i.e. not committed to the changelog) are
   *committed* to the state store, AND
   - The Checkpoint offset (which refers to the changelog topic) is less
   than or equal to the last written changelog offset in rocksdb

I don't see the need to do the full restoration from scratch. My
understanding was that prior to 844/892, full restorations were required
because there could be uncommitted records written to RocksDB; however,
given your use of RocksDB transactions, that can be avoided with the
pattern of 1) commit Kafka transaction, 2) commit RocksDB transaction, 3)
update offset in checkpoint file.

Anyways, your proposed solution works equivalently and I don't believe
there is much overhead to an additional column family in RocksDB. Perhaps
it may even perform better than making separate writes to the checkpoint
file.

Colt McNealy
*Founder, LittleHorse.io*


On Wed, Apr 19, 2023 at 5:53 AM Nick Telford  wrote:

> Hi Colt,
>
> A. I've done my best to de-couple the StateStore stuff from the rest of the
> Streams engine. The fact that there will be only one ongoing (write)
> transaction at a time is not guaranteed by any API, and is just a
> consequence of the way Streams operates. To that end, I tried to ensure the
> documentation and guarantees provided by the new APIs are independent of
> this incidental behaviour. In practice, you're right, this essentially
> refers to "interactive queries", which are technically "read transactions",
> even if they don't actually use the transaction API to isolate themselves.
>
> B. Yes, although not ideal. This is for backwards compatibility, because:
> 1) Existing custom StateStore implementations will implement flush(),
> and not commit(), but the Streams engine now calls commit(), so those calls
> need to be forwarded to flush() for these legacy stores.
> 2) Existing StateStore *users*, i.e. outside of the Streams engine
> itself, may depend on explicitly calling flush(), so for these cases,
> flush() needs to be redirected to call commit().
> If anyone has a better way to guarantee compatibility without introducing
> this potential recursion loop, I'm open to changes!
>
> C. This is described in the "Atomic Checkpointing" section. Offsets are
> stored in a separate RocksDB column family, which is guaranteed to be
> atomically flushed to disk with all other column families. The issue of
> checkpoints being written to disk after commit causing inconsistency if it
> crashes in between is the reason why, under EOS, checkpoint files are only
> written on clean shutdown. This is one of the major causes of "full
> restorations", so moving the offsets into a place where they can be
> guaranteed to be atomically written with the data they checkpoint allows us
> to write the checkpoint offsets *on every commit*, not just on clean
> shutdown.
>
> Regards,
> Nick
>
> On Tue, 18 Apr 2023 at 15:39, Colt McNealy  wrote:
>
> > Nick,
> >
> > Thank you for continuing this work. I have a few minor clarifying
> > questions.
> >
> > A) "Records written to any transaction are visible to all other
> > transactions immediately." I am confused here—I thought there could only
> be
> > one transaction going on at a time for a given state store given the
> > threading model for processing records on a Task. Do you mean Interactive
> > Queries by "other transactions"? (If so, then everything makes sense—I
> > thought that since IQ were read-only then they didn't count as
> > transactions).
> >
> > B) Is it intentional that the default implementations of the flush() and
> > commit() methods in the StateStore class refer to each other in some sort
> > of unbounded recursion?
> >
> > C) How will the getCommittedOffset() method work? At first I thought the
> > way to do it would be using a special key in the RocksDB store to store
> the
> > offset, and committing that with the transaction. But upon second
> thought,
> > since restoration from the changelog is an idempotent procedure, I think
> it
> > would be fine to 1) commit the RocksDB transaction and then 2) write the
> > offset to disk in a checkpoint file. If there is a crash between 1) and
> 2),
> > I think the only downside is now we replay a few more records (at a cost
> of
> > <100ms). Am I missing something there?
> >
> > Other than that, everything makes sense to me.
> >
> > Cheers,
> > Colt McNealy
> > *Founder, LittleHorse.io*
> >
> >
> > On Tue, Apr 18, 2023 at 3:59 AM Nick Telford 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I've updated the KIP to reflect the latest version of the design:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-892%3A+Transactional+Semantics+for+StateStores
> > >
> > > There are 

Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-04-19 Thread Nick Telford
Hi Colt,

A. I've done my best to de-couple the StateStore stuff from the rest of the
Streams engine. The fact that there will be only one ongoing (write)
transaction at a time is not guaranteed by any API, and is just a
consequence of the way Streams operates. To that end, I tried to ensure the
documentation and guarantees provided by the new APIs are independent of
this incidental behaviour. In practice, you're right, this essentially
refers to "interactive queries", which are technically "read transactions",
even if they don't actually use the transaction API to isolate themselves.

B. Yes, although not ideal. This is for backwards compatibility, because:
1) Existing custom StateStore implementations will implement flush(),
and not commit(), but the Streams engine now calls commit(), so those calls
need to be forwarded to flush() for these legacy stores.
2) Existing StateStore *users*, i.e. outside of the Streams engine
itself, may depend on explicitly calling flush(), so for these cases,
flush() needs to be redirected to call commit().
If anyone has a better way to guarantee compatibility without introducing
this potential recursion loop, I'm open to changes!
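
To illustrate the shape of that shim (this is not the real
org.apache.kafka.streams.processor.StateStore, and the commit() signature is
simplified), each default method delegates to the other, so a legacy store that
only overrides flush() and a new store that only overrides commit() both keep
working; a store that overrides neither is what would recurse:

import java.util.Collections;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;

// Simplified sketch only, not the actual Streams interface.
public interface SketchStateStore {

    // Legacy implementations override this; the engine's calls to commit()
    // are forwarded here by the default below.
    default void flush() {
        commit(Collections.emptyMap());
    }

    // New implementations override this; legacy callers that still invoke
    // flush() end up here via the default above.
    default void commit(Map<TopicPartition, Long> changelogOffsets) {
        flush();
    }
}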

C. This is described in the "Atomic Checkpointing" section. Offsets are
stored in a separate RocksDB column family, which is guaranteed to be
atomically flushed to disk with all other column families. The issue of
checkpoints being written to disk after commit causing inconsistency if it
crashes in between is the reason why, under EOS, checkpoint files are only
written on clean shutdown. This is one of the major causes of "full
restorations", so moving the offsets into a place where they can be
guaranteed to be atomically written with the data they checkpoint allows us
to write the checkpoint offsets *on every commit*, not just on clean
shutdown.

Regards,
Nick

On Tue, 18 Apr 2023 at 15:39, Colt McNealy  wrote:

> Nick,
>
> Thank you for continuing this work. I have a few minor clarifying
> questions.
>
> A) "Records written to any transaction are visible to all other
> transactions immediately." I am confused here—I thought there could only be
> one transaction going on at a time for a given state store given the
> threading model for processing records on a Task. Do you mean Interactive
> Queries by "other transactions"? (If so, then everything makes sense—I
> thought that since IQ were read-only then they didn't count as
> transactions).
>
> B) Is it intentional that the default implementations of the flush() and
> commit() methods in the StateStore class refer to each other in some sort
> of unbounded recursion?
>
> C) How will the getCommittedOffset() method work? At first I thought the
> way to do it would be using a special key in the RocksDB store to store the
> offset, and committing that with the transaction. But upon second thought,
> since restoration from the changelog is an idempotent procedure, I think it
> would be fine to 1) commit the RocksDB transaction and then 2) write the
> offset to disk in a checkpoint file. If there is a crash between 1) and 2),
> I think the only downside is now we replay a few more records (at a cost of
> <100ms). Am I missing something there?
>
> Other than that, everything makes sense to me.
>
> Cheers,
> Colt McNealy
> *Founder, LittleHorse.io*
>
>
> On Tue, Apr 18, 2023 at 3:59 AM Nick Telford 
> wrote:
>
> > Hi everyone,
> >
> > I've updated the KIP to reflect the latest version of the design:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-892%3A+Transactional+Semantics+for+StateStores
> >
> > There are several changes in there that reflect feedback from this
> thread,
> > and there's a new section and a bunch of interface changes relating to
> > Atomic Checkpointing, which is the final piece of the puzzle to making
> > everything robust.
> >
> > Let me know what you think!
> >
> > Regards,
> > Nick
> >
> > On Tue, 3 Jan 2023 at 11:33, Nick Telford 
> wrote:
> >
> > > Hi Lucas,
> > >
> > > Thanks for looking over my KIP.
> > >
> > > A) The bound is per-instance, not per-Task. This was a typo in the KIP
> > > that I've now corrected. It was originally per-Task, but I changed it
> to
> > > per-instance for exactly the reason you highlighted.
> > > B) It's worth noting that transactionality is only enabled under EOS,
> and
> > > in the default mode of operation (ALOS), there should be no change in
> > > behavior at all. I think, under EOS, we can mitigate the impact on
> users
> > by
> > > sufficiently low default values for the memory bound configuration. I
> > > understand your hesitation to include a significant change of
> behaviour,
> > > especially in a minor release, but I suspect that most users will
> prefer
> > > the memory impact (under EOS) to the existing behaviour of frequent
> state
> > > restorations! If this is a problem, the changes can wait until the next
> > > major release. I'll be running a patched version of streams in
> production
> > > with these 

KafkaPrincipal, oauth, & JWT

2023-04-19 Thread Neil Buesing
I am exploring how to get roles defined via oauth authentication to be
passed with the KafkaPrincipal (generated by the
DefaultKafkaPrincipalBuilder) so it can be used by authorization.

I know the PrincipalBuilder can be replaced with a custom implementation
along with an alternative KafkaPrincipal implementation, but I was hoping
that the standardized OAuth support within Kafka for handling the JWT could
cover this instead.

In searching the email archives and KIPs I do not see anything about this;
I'm curious if there are any thoughts on it. The downside is that I do not
see how to leverage JWT attributes in a generic way, so a custom Authorizer
would still be necessary.
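
For context, the custom-builder route mentioned above looks roughly like the
sketch below. It is only a sketch: RoleAwarePrincipalBuilder is an illustrative
name, and the actual extraction of roles from the validated JWT is
deployment-specific and omitted here.

import org.apache.kafka.common.security.auth.AuthenticationContext;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.common.security.auth.KafkaPrincipalBuilder;
import org.apache.kafka.common.security.auth.SaslAuthenticationContext;

public class RoleAwarePrincipalBuilder implements KafkaPrincipalBuilder {
    @Override
    public KafkaPrincipal build(AuthenticationContext context) {
        if (context instanceof SaslAuthenticationContext) {
            SaslAuthenticationContext sasl = (SaslAuthenticationContext) context;
            String user = sasl.server().getAuthorizationID();
            // A KafkaPrincipal subclass could carry roles parsed from the JWT here,
            // so a custom Authorizer can read them during authorization.
            return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, user);
        }
        return KafkaPrincipal.ANONYMOUS;
    }
}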

Thanks,

Neil


Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener

2023-04-19 Thread Dániel Urbán
I wouldn't really include a non-existent group (the same way we don't care
about a non-existent topic); that doesn't really matter.
I think that, from a monitoring perspective, an existing group which doesn't
have an offset to checkpoint is equivalent to a topic having no records to
replicate.

I think the precise way to put it is to monitor the topics and groups
picked up by the filtering logic of MM2. "The list currently replicated" is
not a good definition, as an empty topic would still be interesting for
monitoring purposes, even if there is no message to replicate.
I think the core motivation is to capture the output of the
TopicFilter/GroupFilter + the extra, built-in logic of MM2 related to
filtering (e.g. internal topics are never replicated, the heartbeats topics
are always replicated, and so on). This logic is too complex to reproduce
in an external monitoring system, as it would need to use the exact same
TopicFilter/GroupFilter configs as MM2 is using, and then implement the
additional built-in logic of MM2 to finally get the topics and groups
picked up by the replication.

I think this would be useful in any replication setups (finding the
effective list of filtered topics and groups), but especially useful when
using the IdentityReplicationPolicy. One gap related to the identity policy
is that we cannot find the replica topics of a specific flow, even when
using MirrorClient, or having access to the source and target Kafka
clusters, as the "traditional" way of finding replica topics is based on
topic naming and the ReplicationPolicy.

Thanks,
Daniel

On Wed, Apr 19, 2023 at 10:58, hudeqi <16120...@bjtu.edu.cn> wrote:

> Thanks for your reply, Daniel.
> Regarding the group list, do you mean that if the group of the source
> cluster has not committed an offset (the group does not exist or the group
> has not committed an offset to the topic being replicated), then the
> current metric cannot be collected? Then this involves the question of
> motivation: Do we want to monitor the topic list and group list we
> configured, or the topic list and group list that are currently being
> replicated? If it is the latter, shouldn't it be detected for a group that
> has not committed an offset? I don't know if I understand correctly.
>
> best,
> hudeqi
>
>
>  -Original Message-
>  From: "Dániel Urbán" 
>  Sent: 2023-04-19 15:50:01 (Wednesday)
>  To: dev@kafka.apache.org
>  Cc:
>  Subject: Re: Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener
> 
> 


[jira] [Created] (KAFKA-14921) Avoid non numeric values for metrics

2023-04-19 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-14921:
--

 Summary: Avoid non numeric values for metrics
 Key: KAFKA-14921
 URL: https://issues.apache.org/jira/browse/KAFKA-14921
 Project: Kafka
  Issue Type: Improvement
Reporter: Mickael Maison
Assignee: Mickael Maison


Many monitoring tools, such as Prometheus and Graphite, only support numeric 
values. This makes it hard to collect and monitor non-numeric metrics.

We should avoid using Gauges with arbitrary types and provide numeric 
alternatives to such existing metrics.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener

2023-04-19 Thread hudeqi
Thanks for your reply, Daniel.
Regarding the group list, do you mean that if the group of the source cluster 
has not committed an offset (the group does not exist or the group has not 
committed an offset to the topic being replicated), then the current metric 
cannot be collected? Then this involves the question of motivation: Do we want 
to monitor the topic list and group list we configured, or the topic list and 
group list that are currently being replicated? If it is the latter, shouldn't 
it be detected for a group that has not committed an offset? I don't know if I 
understand correctly.

best,
hudeqi


 -Original Message-
 From: "Dániel Urbán" 
 Sent: 2023-04-19 15:50:01 (Wednesday)
 To: dev@kafka.apache.org
 Cc: 
 Subject: Re: Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener
 


Re: Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener

2023-04-19 Thread Dániel Urbán
Hi hudeqi,

Thank you for your comments!
Related to the topic list: you are correct, the partition metrics are
created eagerly, so even if a topic has no active traffic, the metrics are
visible. I missed this fact when creating the KIP.
For the group list: the group metrics are created lazily, so for the
groups, the metrics are only created if there was a checkpoint replicated
(i.e. the group is included in the replication AND has committed offset for
at least one of the replicated partitions).

Regardless of the eager/lazy nature of these metrics, I think that having a
centralized way of getting the list of replicated topics/groups still makes
sense, and it can simplify access to this information for external clients.
E.g., we could add a listener implementation that writes the list of
topics/groups into a compact topic, which external clients can then consume;
that is simpler than scraping partition metrics and collecting the topic
tags.
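
As a rough sketch of that idea (the class name, method name and signature below
are illustrative only, not the interface proposed in the KIP):

import java.util.Set;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative only: publishes the effective replicated-topic set of a flow to a
// compact topic, keyed by flow, so external systems can simply consume that topic.
public class CompactTopicReplicationListener {
    private final Producer<String, String> producer;
    private final String stateTopic;

    public CompactTopicReplicationListener(Producer<String, String> producer,
                                           String stateTopic) {
        this.producer = producer;
        this.stateTopic = stateTopic;
    }

    // Hypothetical callback: invoked whenever MM2 recomputes the filtered topic set.
    public void onTopicsChanged(String sourceAndTarget, Set<String> replicatedTopics) {
        producer.send(new ProducerRecord<>(stateTopic, sourceAndTarget,
                String.join(",", replicatedTopics)));
    }
}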

I will add corrections to the Rejected Alternatives section with more
details about the metrics.

Thanks,
Daniel

On Wed, Apr 19, 2023 at 4:22, hudeqi <16120...@bjtu.edu.cn> wrote:

> Hi, I have some questions about motivation.
> Is the problem to be solved by this KIP that we cannot accurately monitor
> the topics and groups currently being replicated? From my point of view,
> using MirrorSourceMetrics.recordRate can monitor the configured topic list
> that is currently being replicated (even if the topic has no data), and
> using MirrorCheckpointMetrics.CHECKPOINT_LATENCY can monitor the currently
> replicated group list (if I am wrong, please correct me).
>
> best,
> hudeqi
>
> Dániel Urbán urb.dani...@gmail.com wrote:
> > Hello everyone,
> >
> > I would like to bump this KIP. Please consider reviewing it, as it would
> > improve the monitoring capabilities around MM2.
> > I also submitted a PR (https://github.com/apache/kafka/pull/13595) to
> > demonstrate the current state of the KIP.
> >
> > Thanks in advance,
> > Daniel
> >
> > On Thu, Apr 13, 2023 at 15:02, Dániel Urbán wrote:
> >
> > > Hi everyone,
> > >
> > > I would like to start a discussion on KIP-918: MM2 Topic And Group
> > > Listener (
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-918%3A+MM2+Topic+And+Group+Listener
> > > ).
> > > This new feature of MM2 would allow following the latest set of
> replicated
> > > topics and groups, which is currently not possible in MM2.
> Additionally,
> > > this would help IdentityReplicationPolicy users, as they could use
> this new
> > > feature to track the replicated topics (which is not available through
> the
> > > policy due to topics not being renamed during replication).
> > >
> > > Thanks in advance,
> > > Daniel
> > >
>