Hi Tina,

Thanks for the updates!
RE item 1: Ah, to be clear, I wasn't suggesting that we retry with the incremental API for every request, just wondering whether we'd want to block on the very first request completing before we issue subsequent requests, since otherwise we might send multiple requests with an unsupported API to the broker. I don't necessarily foresee any issues if this does happen, but it seemed worth calling out in case it has implications that others might catch. Anyway, no further questions on this point--I think it's fine to potentially issue those subsequent requests (unless anyone else finds something wrong with them), and I also think it's fine to require users to restart/reconfigure MM2 in order for it to go back to using the incremental API in the event that it was originally configured with "use.incremental.alter.configs" set to "requested" and then had to fall back on the legacy API due to incompatibilities with the broker.

RE item 2:

> If the user deletes a config in the source cluster, it would be reset to
> the default value but the config source does not change back to
> DEFAULT_CONFIG.

Is that correct? I was under the impression that the source does get reset, and local tests using the kafka-configs.sh script and the Java Admin client both seem to confirm this (I verified by using the Admin::describeConfigs method [1], which is what MM2 currently uses to scan for configs from topics on the source cluster [2]). I've included a rough sketch of that check below.

Based on the above, I think this part is inaccurate:

> Basically, if a user deletes a config for the source topic, with either
> API, the config on the target topic is overwritten with the source
> cluster's default.

I think the behavior when MM2 uses the legacy API is actually that configs which are deleted from topics on the source cluster are reset to the target cluster's default on the target cluster. If MM2 uses the incremental API, we run the risk of leaving the property unchanged on the target cluster and not propagating the deletion--just as you'd initially imagined.

One option could be to act differently depending on which API we're using for the topic config alteration request, in order to preserve the same user-facing behavior (see the second sketch below):

- If we're using the legacy API, keep the existing behavior, where we never propagate default properties from the source to the target (which effectively wipes them from the target, due to the all-or-nothing nature of the legacy API)
- If we're using the incremental API, explicitly delete from the target cluster any topic config properties that are defaults on the source cluster, which also matches the existing behavior

It'd be nice if someone more familiar with these client/broker APIs could keep me honest; I may be misunderstanding some of the contracts here. However, if this is all valid, we could then add the "shouldReplicateSourceDefault" method to the ConfigPropertyFilter interface just as suggested, but with a default of false instead of true (third sketch below). It might even be worth adding a property to the DefaultConfigPropertyFilter class [3] that allows users to tweak this on a per-property, property-regex, or all-or-nothing basis. One more thought--could it also make sense to inform the ConfigPropertyFilter interface about which broker API we're using for the request?
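For reference, here's roughly the check I ran with the Java Admin client--a standalone sketch against a local broker, not MM2 code, and the class name is just for illustration. It uses the same topic as the kafka-configs.sh transcript further down the thread:

    import java.util.Map;
    import java.util.Set;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class ConfigSourceCheck {
        public static void main(String[] args) throws Exception {
            // Assumes a local broker and the "quickstart-events" topic
            try (Admin admin = Admin.create(Map.of("bootstrap.servers", "localhost:9092"))) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "quickstart-events");
                Config config = admin.describeConfigs(Set.of(topic)).all().get().get(topic);
                ConfigEntry retention = config.get("retention.ms");
                // While the override is in place this prints DYNAMIC_TOPIC_CONFIG; after
                // deleting the override it prints DEFAULT_CONFIG, i.e. the source is reset
                System.out.println(retention.name() + " -> " + retention.source());
            }
        }
    }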
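And to illustrate the second bullet above, here's a very rough sketch of what the incremental path might look like. The class and method names are completely hypothetical (this is not the actual MM2 code), and I've left out the ConfigPropertyFilter logic and the error handling we'd need for the fallback to the legacy API:

    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.Map;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class IncrementalSyncSketch {
        // sourceConfig is the topic config described from the source cluster;
        // targetTopic is the corresponding resource on the target cluster
        static void syncTopicConfigs(Admin targetAdmin, Config sourceConfig, ConfigResource targetTopic) {
            Collection<AlterConfigOp> ops = new ArrayList<>();
            for (ConfigEntry entry : sourceConfig.entries()) {
                if (entry.isDefault()) {
                    // Property is a default on the source cluster (same isDefault() check
                    // MM2 uses today [2]): issue an explicit DELETE so any override on the
                    // target is wiped, matching the all-or-nothing legacy API's behavior
                    ops.add(new AlterConfigOp(entry, AlterConfigOp.OpType.DELETE));
                } else {
                    ops.add(new AlterConfigOp(entry, AlterConfigOp.OpType.SET));
                }
            }
            // Fire-and-forget, like MM2's current config syncing logic
            targetAdmin.incrementalAlterConfigs(Map.of(targetTopic, ops));
        }
    }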
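Finally, the interface change could then look something like this. Again just a sketch--I'm paraphrasing the existing interface from memory and flipping the proposed method's default to false per the above:

    import org.apache.kafka.common.Configurable;

    public interface ConfigPropertyFilter extends Configurable, AutoCloseable {

        // Existing method: whether the given topic config property
        // should be replicated at all
        boolean shouldReplicateConfigProperty(String prop);

        // Proposed method (sketch): whether a property that is set to its default
        // value on the source cluster should still be replicated to the target.
        // Defaulting to false preserves MM2's current behavior of never
        // propagating source-cluster defaults.
        default boolean shouldReplicateSourceDefault(String prop) {
            return false;
        }

        default void close() {}
    }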
RE item 3: LGTM, thank you :)

[1] - https://kafka.apache.org/34/javadoc/org/apache/kafka/clients/admin/Admin.html#describeConfigs(java.util.Collection)
[2] - https://github.com/apache/kafka/blob/eee2bf9686db85514c474732874d14456ae96ebc/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceConnector.java#L503
[3] - https://github.com/apache/kafka/blob/eee2bf9686db85514c474732874d14456ae96ebc/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/DefaultConfigPropertyFilter.java

Cheers,

Chris

On Mon, Feb 13, 2023 at 3:16 AM Gantigmaa Selenge <gsele...@redhat.com> wrote:

> Hi Chris and Luke,
>
> Thank you very much for your feedback.
>
> I have addressed some of the suggestions and would like to clarify a few things on the others:
>
> 1) The current logic for syncing topic configs in MM2 is basically fire-and-forget; all we do is log a warning message [1] if an attempt fails. When "use.incremental.alter.configs" is set to "requested", we'll need to know whether attempts using the incremental API succeed or not, and then adjust our behavior accordingly. Will the plan here be to block on the success/failure of the first request before sending any more, or will we just switch over to the legacy API as soon as any request fails due to targeting an incompatible broker, possibly after multiple requests with the new API have been sent or at least queued up in the admin client?
>
> When it's set to "requested", I think the suggestion was to just switch over to the legacy API as soon as any request fails due to targeting an incompatible broker. We could keep using the legacy API until the MirrorMaker setting is changed by the user or MirrorMaker restarts, instead of trying the new API every time it syncs the configurations. I'm not sure which approach is best here.
>
> 2) We don't replicate default properties from the source cluster right now [2]. Won't making ConfigPropertyFilter::shouldReplicateSourceDefault return true by default change that behavior? If so, what's the motivation for favoring defaults from the source cluster over defaults for the target cluster, especially given that users may already be relying on the opposite?
>
> The concern was around what happens when a topic config is deleted in the source cluster. I initially thought that, because the legacy API resets all the configs other than the ones being altered, it effectively propagates the deletion of a config in the source cluster. I thought that by migrating to the incrementalAlterConfigs API, the deletion would no longer get propagated, but looking into it again, that may not be the case.
>
> If the user deletes a config in the source cluster, it would be reset to the default value but the config source does not change back to DEFAULT_CONFIG.
> For example:
>
>     ./kafka-configs.sh --alter --entity-type topics --entity-name quickstart-events --add-config retention.ms=720000 --bootstrap-server localhost:9092
>
>     ./kafka-configs.sh --describe --entity-type topics --entity-name quickstart-events --bootstrap-server localhost:9092
>     Dynamic configs for topic quickstart-events are:
>       retention.ms=720000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=720000}
>
>     ./kafka-configs.sh --alter --entity-type topics --entity-name quickstart-events --delete-config retention.ms --bootstrap-server localhost:9092
>
>     ./kafka-configs.sh --describe --entity-type topics --entity-name quickstart-events --bootstrap-server localhost:9092
>       retention.ms=604800000 sensitive=false synonyms={}
>
> Therefore, even with the legacy API, the deletion of the source topic config does not actually get propagated, because the isDefault() check here <https://github.com/apache/kafka/blob/6e2b86597d9cd7c8b2019cffb895522deb63c93a/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceConnector.java#L460> would no longer return true. Basically, if a user deletes a config for the source topic, with either API, the config on the target topic is overwritten with the source cluster's default. If I understood this correctly, maybe we don't need to extend ConfigPropertyFilter, as it would still rely on the DEFAULT_CONFIG config source and the KIP would not change the current behaviour?
>
> I have addressed the last comment from Chris about excluding code-level details from the proposed solution.
>
> I have also addressed Luke's comment about the compatibility section.
>
> Please let me know what you think.
>
> Regards,
> Tina
>
> On Tue, Feb 7, 2023 at 8:06 AM Luke Chen <show...@gmail.com> wrote:
>
> > Hi Tina,
> >
> > Thanks for the KIP to fix the issue.
> >
> > Some comments:
> >
> > 1. In the compatibility section, you said: `By default, the new setting will be set to false so it does not change the current behaviour.`
> >
> > I'm confused, what is the config we need to set to `false` to avoid breaking compatibility? All I can see is that there is one new config introduced, use.incremental.alter.configs, defaulting to "requested". Does that mean it'll change current behavior? If so, I think we should make it clear in the compatibility section what will be changed after this KIP.
> >
> > 2. It looks like you're going to introduce a new method in the existing interface. Could you follow the pattern in other KIPs (ex: KIP-888 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-888%3A+Batch+describe+ACLs+and+describe+client+quotas#KIP888:BatchdescribeACLsanddescribeclientquotas-NewAdminAPIs>) to list the javadoc and the method name together?
> >
> > Thank you.
> > Luke
> >
> > On Fri, Feb 3, 2023 at 11:46 PM Chris Egerton <chr...@aiven.io.invalid> wrote:
> >
> > > Hi Tina,
> > >
> > > Thanks for the KIP! I recently ran into this exact issue and it's great to see a fix being proposed. I have a few small comments but overall this looks good:
> > >
> > > 1) The current logic for syncing topic configs in MM2 is basically fire-and-forget; all we do is log a warning message [1] if an attempt fails. When "use.incremental.alter.configs" is set to "requested", we'll need to know whether attempts using the incremental API succeed or not, and then adjust our behavior accordingly.
> > > Will the plan here be to block on the success/failure of the first request before sending any more, or will we just switch over to the legacy API as soon as any request fails due to targeting an incompatible broker, possibly after multiple requests with the new API have been sent or at least queued up in the admin client?
> > >
> > > 2) We don't replicate default properties from the source cluster right now [2]. Won't making ConfigPropertyFilter::shouldReplicateSourceDefault return true by default change that behavior? If so, what's the motivation for favoring defaults from the source cluster over defaults for the target cluster, especially given that users may already be relying on the opposite?
> > >
> > > 3) Nit: IMO the parts in the "proposed changes" section that detail changes to internal classes aren't really necessary, since they're not relevant to user-facing behavior and the classes/methods described in them might change between now and when the PR for the KIP gets opened/reviewed/merged. I think the only points that need to be in the KIP are the ones beginning with "Extend ConfigPropertyFilter class", "Add a new configuration setting to MirrorMaker", and "From Kafka 4.0" (which itself can just describe the broker APIs that are used by MM2 in general, without referring to the specific name of the method in MM2 that will call them).
> > >
> > > [1] - https://github.com/apache/kafka/blob/6e2b86597d9cd7c8b2019cffb895522deb63c93a/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceConnector.java#L429
> > > [2] - https://github.com/apache/kafka/blob/6e2b86597d9cd7c8b2019cffb895522deb63c93a/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceConnector.java#L460
> > >
> > > Thanks again for the KIP!
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Wed, Jan 25, 2023 at 10:44 AM Gantigmaa Selenge <gsele...@redhat.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > If there are no further comments on the KIP, I will start a vote on it.
> > > >
> > > > Regards,
> > > >
> > > > On Mon, Jan 16, 2023 at 11:14 AM Gantigmaa Selenge <gsele...@redhat.com> wrote:
> > > >
> > > > > Thanks everyone.
> > > > >
> > > > > I took the suggestions and updated the KIP accordingly. Please let me know if there is anything else I could improve on.
> > > > >
> > > > > Regards,
> > > > > Tina
> > > > >
> > > > > On Sun, Jan 15, 2023 at 10:24 PM Ismael Juma <ism...@juma.me.uk> wrote:
> > > > >
> > > > > > Hi Tina,
> > > > > >
> > > > > > See below.
> > > > > >
> > > > > > On Wed, Jan 11, 2023 at 3:03 AM Gantigmaa Selenge <gsele...@redhat.com> wrote:
> > > > > >
> > > > > > > I do like this idea, however when it's set to required, I wasn't sure how MirrorMaker should behave. It's probably not a great experience if MirrorMaker starts crashing at some point after it's already running, due to an incompatible broker version.
> > > > > >
> > > > > > This would only happen if the user explicitly requests the strict required ("non fallback") mode.
> > > > > > There are many reasons why one may want this: say you want to be sure that your system is not susceptible to the "alterConfigs" problems, or you want to write a test that fails if "alterConfigs" is used.
> > > > > >
> > > > > > > If the incrementalAlterConfig request fails because the target cluster is running an older version, then we could log a WARN message that says something like "The config to use the incrementalAlterConfig API for syncing topic configurations has been set to true, however the target cluster is running an incompatible version, therefore using the legacy alterConfig API". This way MirrorMaker never has to stop working, and it makes the user aware of what's being used.
> > > > > >
> > > > > > Logging statements are not great as the sole mechanism (although useful as a complementary one), since one cannot easily test against them and they're often missed alongside all the other logging statements.
> > > > > >
> > > > > > > In this case, we also would not need 3 separate values for the config; instead we would use the original true or false values:
> > > > > > > - true -> would use the incrementalAlterConfig API, but if it's unavailable, fall back to the legacy API
> > > > > > > - false -> keep using the legacy API
> > > > > > >
> > > > > > > Set the flag to true by default and remove the config in 4.0.
> > > > > >
> > > > > > But this means you have different behavior depending on the supported versions forever. Even though the config values are simpler, it's harder to reason about the behavior.
> > > > > >
> > > > > > > One suggestion: I'm not sure how concerned you are about people's ability to migrate, but if you want to make it as smooth as possible, you could add one more step. In the 4.0 release, while removing `use.incremental.alter.configs`, you can also add `use.legacy.alter.configs` to give users a path to continue using the old behavior even after the default changes.
> > > > > > >
> > > > > > > If we implement the fallback logic as Ismael suggested above, I think we would not need this extra flag later anymore.
> > > > > > >
> > > > > > > Please let me know what you think. Then I can go ahead and update the KIP.
> > > > > >
> > > > > > IncrementalAlterConfigs has been around since Apache Kafka 2.3, released in June 2019. By the time Apache Kafka 4.0 is released, it will have been at least 4.5 years. I think it's reasonable to set it as the new baseline and not maintain the fallback code. The key benefit is having behavior that is easy to reason about.
> > > > > >
> > > > > > Ismael