I updated the KIP to reflect the options we have been discussing since Oct
2021 for people who didn't read the discussion thread.

On Wed, May 11, 2022 at 11:07 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
wrote:

> Hi Colin,
> I don't mind the idea of MM2 users implementing the AdminClient interface.
> However, there're two disadvantages to this.
>
>    1. Having around 70 methods definitions to have "NotImplemented" is
>    one downside, and keep up with these if the AdminClient interface changes.
>    2. It makes it hard to list what admin functionality MM2 uses as MM2
>    interactions with AdminClient in the codebase are in many places.
>
> I guess it's OK for MM2 users who want to build their admin client to
> carry this burden, as I explained in my previous response to the discussion
> thread. And we can do some cleanup to the codebase to have all Admin
> interactions in MM2 in a utils class or something like that to make it
> easier to navigate what MM2 needs from the Admin interface.
>
> Maybe I'm misunderstanding the use-case you're describing here. But it
>> seems to me that if you create a proxy that has the ability to do any admin
>> operation, and give MM2 access to that proxy, the security model is the
>> same as just giving MM2 admin access. (Or it may be worse if the sysadmin
>> doesn't know what this proxy is doing, and doesn't lock it down...)
>>
>
> MM2 runs with the assumption that it has
>
>    - "CREATE" ACLs for topics on the source clusters to create
>    `heartbeat` topics.
>    - "CREATE"  and "ALTER" ACLs to create topics, add partitions, update
>    topics' config and topics' ACLs (in future, will also include group ACLS as
>    Mikael mentioned before in the thread) on the destination clusters.
>
> Most organisations have some resource management or federated solutions
> (some would even have a budget system as part of these systems) to manage
> Kafka resources, and these systems are usually the only application allowed
> to initializing a client with "CREATE" and "ALTER" ACLs. They don't grant
> these ACLs to any other teams/groups/applications to create such a client
> outside these systems, so assuming MM2 can bypass these systems and use the
> AdminClient directly to create/update resources isn't valid. This is the
> primary concern here.
>
> The KIP is trying to give MM2 more flexibility to allow organisations to
> integrate MM2 with their resource management system as they see fit without
> forcing them to disable most MM2 features.
>
> Hope this make sense and clear it up.
>
>
> On Wed, May 11, 2022 at 9:09 PM Colin McCabe <cmcc...@apache.org> wrote:
>
>> Hi Omnia Ibrahim,
>>
>> I'm sorry, but I am -1 on adding competing Admin interfaces. This would
>> create confusion and a heavier maintenance burden for the project.
>>
>> Since the org.apache.kafka.clients.admin.Admin interface is a Java
>> interface, any third-party software that wants to insert its own
>> implementation of the interface can do so already.
>>
>> A KIP to make the Admin class used pluggable for MM2 would be reasonable.
>> Adding a competing admin API is not.
>>
>> It's true that there are many Admin methods, but you do not need to
>> implement all of them -- just the ones that MirrorMaker uses. The other
>> ones can throw a NotImplementedException or similar.
>>
>> > The current approach also assumes that the user running MM2 has the
>> Admin right to
>> > create/update topics, which is only valid if the user who runs MM2 also
>> manages both
>> > source and destination clusters.
>>
>> Maybe I'm misunderstanding the use-case you're describing here. But it
>> seems to me that if you create a proxy that has the ability to do any admin
>> operation, and give MM2 access to that proxy, the security model is the
>> same as just giving MM2 admin access. (Or it may be worse if the sysadmin
>> doesn't know what this proxy is doing, and doesn't lock it down...)
>>
>> best,
>> Colin
>>
>>
>> On Mon, May 9, 2022, at 13:21, Omnia Ibrahim wrote:
>> > Hi, I gave the KIP another look after talking to some people at the
>> Kafka
>> > Summit in London. And I would like to clear up the motivation of this
>> KIP.
>> >
>> >
>> > At the moment, MM2 has some opinionated decisions that are creating
>> issues
>> > for teams that use IaC, federated solutions or have a capacity/budget
>> > planning system for Kafka destination clusters. To explain it better,
>> let's
>> > assume we have MM2 with the following configurations to highlight these
>> > problems.
>> >
>> > ```
>> >
>> > topics = .*
>> >
>> > refresh.topics.enabled = true
>> >
>> > sync.topic.configs.enabled = true
>> >
>> > sync.topic.acls.enabled = true
>> >
>> > // Maybe in futrue we can have sync.group.acls.enabled = true
>> >
>> > ```
>> >
>> >
>> > These configurations allow us to run MM2 with the value of its full
>> > features. However, there are two main concerns when we run on a scale
>> with
>> > these configs:
>> >
>> > 1. *Capacity/Budgeting Planning:*
>> >
>> > Functionality or features that impact capacity planning using MM2 are:
>> >
>> >    1. MM2 automatically creates topics (breaking the rule of
>> >    `auto.create.topics.enable=false`) and creates topic partitions on
>> >    destination clusters if the number of partitions increases on the
>> source.
>> >    In the previous example, this functionality will apply to any topic
>> that
>> >    matches the regex of the `topics` config.
>> >    2. Sync topic configs include configurations that impact capacity
>> like `
>> >    retention.ms` and `retention.bytes`.
>> >
>> > These 2 points lead to adding new untracked capacity to destination
>> > clusters without a way to count for them up-front or safeguard the
>> cluster.
>> > The team that runs the cluster will only see the capacity issue when
>> their
>> > disk usage hits the threshold for their alerts. The desk capacity issue
>> can
>> > be avoided if MM2 is flexible enough to
>> >
>> >    - have a way for teams that run their ecosystem to have MM2 behave
>> >    within their system.
>> >    - disable the auto-creation and avoid syncing configs that impact
>> >    capacity
>> >
>> >
>> > 2. *Provisioning conflict:*
>> >
>> > In the previous MM2 configurations; we ended up with conflict as MM2
>> used
>> > `AdminClient` directly to perform the following functionality
>> >
>> >    -  Create a Kafka topic (no way to disable this at the moment)
>> >    -  Add new Kafka partitions (no way to disable this at the moment)
>> >    -  Sync Kafka Topic configurations (can be disabled, but then this
>> >    reduces the value of MM2 potential for users)
>> >    -  Sync Kafka topic's ACLs (can be disabled, but this reduces the
>> users'
>> >    value). Disabling this feature also means that users must ensure
>> they have
>> >    the right ACLs to the mirrored topics on the destination cluster
>> before
>> >    switching their consumers, especially when MM2 is used for disaster
>> >    recovery. It may lead to extra downtime for them.
>> >
>> >
>> > All these functionalities are using AdminClient; which causes an issue
>> with
>> > teams that
>> >
>> >    - Manage their Kafka resources using tools like Strimizi or custom
>> >    federated solutions. For example, Strimizi's UserOperator doesn't
>> sync the
>> >    topic ACLs when MM2 is enabled. Strimzi documentation mentions that
>> users
>> >    must to disable MM2 `sync.topic.acls.enabled` if they use
>> `UserOperator`.
>> >    On the other hand, Strimizi's TopicOperator doesn't have the same
>> issue
>> >    because it has a bi-directional reconciliation process that watches
>> the
>> >    topics state on the Kafka cluster and updates KafkaTopic resources
>> for
>> >    Strimzi. This design works fine with Kafka MM2 for Topics but not for
>> >    syncing ACLs. Strimizi TopicOperator also doesn't have a way to stop
>> >    syncing config that impact capacity for example retention configs.
>> >
>> >
>> >    - Teams that run MM2 but don't own the destination cluster. In this
>> >    case, these teams don't have Admin access, but they may have Kafka
>> >    management solutions, such as yahoo/CMAK or an in-house solution.
>> For such
>> >    a tool as CMAK, these teams can update/create resources using CMAK
>> REST API.
>> >
>> >
>> > The Proposed KIP gives users the flexibility to integrate MM2 within
>> their
>> > ecosystem without disabling any MM2 features. We can achieve this
>> > flexibility with one of the following solutions.
>> >
>> >    1. Introduce a new interface that hides Admin interactions in one
>> place.
>> >    Then users can provide their way of resource management. As well as
>> clean
>> >    up the MM2 code by having one place that manages the resources, as
>> at the
>> >    moment, MM2 usage of AdminClient is all over the code.
>> >    2. The second solution could be to add only a new config that allows
>> the
>> >    users to override AdminClient with another implementation of the
>> >    AdminClient interface, as Ryanne suggested before. The downside is
>> that
>> >    AdminClient is enormous and constantly under development, so any
>> users who
>> >    opt-in for custom implementation will need to carry this burden.
>> >
>> > I favour the first solution as it will make it either later to add any
>> new
>> > feature related to resource management. But don't mind if others think
>> that
>> > the second solution is easier for MM2 design.
>> >
>> >
>> > *Note*: There are two possible future KIPs following this KIP to
>> >
>> >    1. Add config to disable MM2 from auto creating or adding new topic
>> >    partitions.
>> >    2. Add a way to exclude a specific topic's configuration from being
>> >    synced.
>> >
>> >
>> > I hope this clears up the problem better. Please let me know what do you
>> > think.
>> >
>> >
>> > On Mon, Feb 7, 2022 at 3:26 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
>> > wrote:
>> >
>> >> Hi Mickael. Thanks for the feedback. I address some of your points
>> below.
>> >>
>> >> *> This seems to address a relatively advanced and specific use case*
>> >> The main point of the KIP is that MM2 is making a massive assumption
>> that
>> >> it has the right to alter/create resources. This assumption isn't
>> valid in
>> >> the world of Infra-as-Code, federated solutions and popularity of OS
>> Kafka
>> >> Kubernetes operators; these infra/resource management solutions aren't
>> >> advanced use-cases anymore nowadays. These concerns had been raised in
>> the
>> >> past, especially regarding the assumption that MM2 can create topics
>> on the
>> >> destination cluster. For example,
>> >> https://issues.apache.org/jira/browse/KAFKA-12753 and
>> >> https://www.mail-archive.com/dev@kafka.apache.org/msg119340.html.
>> >> The primary motivation is giving some power to data
>> >> platform/infrastructure team to make MM2 part of their internal Kafka
>> >> ecosystem without dropping the main features that make MM2 valuable,
>> like
>> >> syncing topic configs. For example, if someone uses any OS Kafka k8s
>> >> operator, they can implement the class to interact with the k8s
>> operator to
>> >> create these resources.
>> >>
>> >> *> My initial concern is this may make it hard to evolve MirrorMaker as
>> >> we'll likely need to update this new interface if new features are
>> added.*
>> >> I agree it's a disadvantage to adding a new interface however adding
>> more
>> >> admin interactions from MM2 to alter/create resources and access will
>> feed
>> >> the main issue as I mentioned above with the popularity of IaC and
>> >> federated solutions; most data platform/infrastructure teams will endup
>> >> disabling these new features.
>> >> Also, at the moment, most MM2 interactions with the admin client are
>> >> scattered across the codebase so having one place where all admin
>> >> interactions are listed isn't a bad thing.
>> >>
>> >> *> For example if we wanted to sync group ACLs.*
>> >> As I mentioned before, altering any resource's configurations with MM2
>> is
>> >> the one main concern for any data platform/infrastructure team that
>> wants
>> >> to have control over their clusters and use MM2. So the main question
>> with
>> >> adding any new altering feature like sync group ACLs will raise the
>> same
>> >> question of how many teams will actually enable this feature.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> *>Regarding the proposed API, I have a few suggestions: >- What about
>> >> using configure() instead of the constructor to pass the
>> >configuration,
>> >> especially as it's implementing Configurable >- It's not clear what
>> all the
>> >> arguments of createTopicPartitions()>are. What's the difference between
>> >> partitionCounts and newPartitions?>Should we have separate methods for
>> >> creating topics and partitions? >- Do we really need
>> >> createCompactedTopic()? >- Instead of updateTopicConfigs() and
>> updateAcls()
>> >> should we use the >"alter" prefix to stay consistent with Admin?*
>> >>
>> >> These are good suggestions that will update the KIP to address these.
>> >> Regarding the `createCompactedTopic` MM2 is using this method to create
>> >> internal topics.
>> >>
>> >> Thanks
>> >>
>> >> On Wed, Jan 26, 2022 at 1:55 PM Mickael Maison <
>> mickael.mai...@gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Omnia,
>> >>>
>> >>> Thanks for the KIP, sorry for taking so long to comment. I've only had
>> >>> time to take a quick look so far.
>> >>>
>> >>> This seems to address a relatively advanced and specific use case. My
>> >>> initial concern is this may make it hard to evolve MirrorMaker as
>> >>> we'll likely need to update this new interface if new features are
>> >>> added. For example if we wanted to sync group ACLs.
>> >>> I'm wondering if it's something you've thought about. I'm not saying
>> >>> it's a blocker but we have to weigh the pros and cons when introducing
>> >>> new features.
>> >>>
>> >>> Regarding the proposed API, I have a few suggestions:
>> >>> - What about using configure() instead of the constructor to pass the
>> >>> configuration, especially as it's implementing Configurable
>> >>> - It's not clear what all the arguments of createTopicPartitions()
>> >>> are. What's the difference between partitionCounts and newPartitions?
>> >>> Should we have separate methods for creating topics and partitions?
>> >>> - Do we really need createCompactedTopic()?
>> >>> - Instead of updateTopicConfigs() and updateAcls() should we use the
>> >>> "alter" prefix to stay consistent with Admin?
>> >>>
>> >>> Thanks,
>> >>> Mickael
>> >>>
>> >>> On Wed, Jan 26, 2022 at 11:26 AM Omnia Ibrahim <
>> o.g.h.ibra...@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> > Hi,
>> >>> > If there are no more concerns regarding the proposal can I get some
>> >>> votes on the KIP here
>> >>> https://lists.apache.org/thread/950lpxjb5kbjm8qdszlpxm9h4j4sfyjx
>> >>> >
>> >>> > Thanks
>> >>> >
>> >>> > On Wed, Oct 27, 2021 at 3:54 PM Ryanne Dolan <ryannedo...@gmail.com
>> >
>> >>> wrote:
>> >>> >>
>> >>> >> Well I'm convinced! Thanks for looking into it.
>> >>> >>
>> >>> >> Ryanne
>> >>> >>
>> >>> >> On Wed, Oct 27, 2021, 8:49 AM Omnia Ibrahim <
>> o.g.h.ibra...@gmail.com>
>> >>> wrote:
>> >>> >>
>> >>> >> > I checked the difference between the number of methods in the
>> Admin
>> >>> >> > interface and the number of methods MM2 invokes from Admin, and
>> this
>> >>> diff
>> >>> >> > is enormous (it's more than 70 methods).
>> >>> >> > As far as I can see, the following methods MM2 depends on (in
>> >>> >> > MirrorSourceConnector, MirrorMaker, MirrorCheckpointTask and
>> >>> >> > MirrorCheckpointConnector), this will leave 73 methods on the
>> Admin
>> >>> >> > interface that customer will need to return dummy data for,
>> >>> >> >
>> >>> >> >    1. create(conf)
>> >>> >> >    2. close
>> >>> >> >    3. listTopics()
>> >>> >> >    4. createTopics(newTopics, createTopicsOptions)
>> >>> >> >    5. createPartitions(newPartitions)
>> >>> >> >    6. alterConfigs(configs)  // this method is marked for
>> >>> deprecation in
>> >>> >> >    Admin and the ConfigResource MM2 use is only TOPIC
>> >>> >> >    7. createAcls(aclBindings) // the list of bindings always
>> >>> filtered by
>> >>> >> >    TOPIC
>> >>> >> >    8. describeAcls(aclBindingFilter) // filter is always
>> >>> ANY_TOPIC_ACL
>> >>> >> >    9. describeConfigs(configResources) // Always for TOPIC
>> resources
>> >>> >> >    10. listConsumerGroupOffsets(groupId)
>> >>> >> >    11. listConsumerGroups()
>> >>> >> >    12. alterConsumerGroupOffsets(groupId, offsets)
>> >>> >> >    13. describeCluster() // this is invoked from
>> >>> >> > ConnectUtils.lookupKafkaClusterId(conf),
>> >>> >> >    but MM2 isn't the one that initialize the AdminClient
>> >>> >> >
>> >>> >> > Going with the Admin interface in practice will make any custom
>> Admin
>> >>> >> > implementation humongous even for a fringe use case because of
>> the
>> >>> number
>> >>> >> > of methods that need to return dummy data,
>> >>> >> >
>> >>> >> > I am still leaning toward a new interface as it abstract all
>> MM2's
>> >>> >> > interaction with Kafka Resources in one place; this makes it
>> easier
>> >>> to
>> >>> >> > maintain and make it easier for the use cases where customers
>> wish to
>> >>> >> > provide a different method to handle resources.
>> >>> >> >
>> >>> >> > Omnia
>> >>> >> >
>> >>> >> > On Tue, Oct 26, 2021 at 4:10 PM Ryanne Dolan <
>> ryannedo...@gmail.com>
>> >>> >> > wrote:
>> >>> >> >
>> >>> >> > > I like the idea of failing-fast whenever a custom impl is
>> >>> provided, but I
>> >>> >> > > suppose that that could be done for Admin as well. I agree your
>> >>> proposal
>> >>> >> > is
>> >>> >> > > more ergonomic, but maybe it's okay to have a little friction
>> in
>> >>> such
>> >>> >> > > fringe use-cases.
>> >>> >> > >
>> >>> >> > > Ryanne
>> >>> >> > >
>> >>> >> > >
>> >>> >> > > On Tue, Oct 26, 2021, 6:23 AM Omnia Ibrahim <
>> >>> o.g.h.ibra...@gmail.com>
>> >>> >> > > wrote:
>> >>> >> > >
>> >>> >> > > > Hey Ryanne, Thanks fo the quick feedback.
>> >>> >> > > > Using the Admin interface would make everything easier, as
>> MM2
>> >>> will
>> >>> >> > need
>> >>> >> > > > only to configure the classpath for the new implementation
>> and
>> >>> use it
>> >>> >> > > > instead of AdminClient.
>> >>> >> > > > However, I have two concerns
>> >>> >> > > > 1. The Admin interface is enormous, and the MM2 users will
>> need
>> >>> to know
>> >>> >> > > the
>> >>> >> > > > list of methods MM2 depends on and override these only
>> instead of
>> >>> >> > > > implementing the whole Admin interface.
>> >>> >> > > > 2. MM2 users will need keep an eye on any changes to Admin
>> >>> interface
>> >>> >> > that
>> >>> >> > > > impact MM2 for example deprecating methods.
>> >>> >> > > > Am not sure if adding these concerns on the users is
>> acceptable
>> >>> or not.
>> >>> >> > > > One solution to address these concerns could be adding some
>> >>> checks to
>> >>> >> > > make
>> >>> >> > > > sure the methods MM2 uses from the Admin interface exists to
>> fail
>> >>> >> > faster.
>> >>> >> > > > What do you think
>> >>> >> > > >
>> >>> >> > > > Omnia
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > > On Mon, Oct 25, 2021 at 11:24 PM Ryanne Dolan <
>> >>> ryannedo...@gmail.com>
>> >>> >> > > > wrote:
>> >>> >> > > >
>> >>> >> > > > > Thanks Omnia, neat idea. I wonder if we could use the
>> existing
>> >>> Admin
>> >>> >> > > > > interface instead of defining a new one?
>> >>> >> > > > >
>> >>> >> > > > > Ryanne
>> >>> >> > > > >
>> >>> >> > > > > On Mon, Oct 25, 2021, 12:54 PM Omnia Ibrahim <
>> >>> >> > o.g.h.ibra...@gmail.com>
>> >>> >> > > > > wrote:
>> >>> >> > > > >
>> >>> >> > > > > > Hey everyone,
>> >>> >> > > > > > Please take a look at KIP-787
>> >>> >> > > > > >
>> >>> >> > > > > >
>> >>> >> > > > >
>> >>> >> > > >
>> >>> >> > >
>> >>> >> >
>> >>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-787%3A+MM2+Interface+to+manage+Kafka+resources
>> >>> >> > > > > > <
>> >>> >> > > > > >
>> >>> >> > > > >
>> >>> >> > > >
>> >>> >> > >
>> >>> >> >
>> >>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-787%3A+MM2+Interface+to+manage+Kafka+resources
>> >>> >> > > > > > >
>> >>> >> > > > > >
>> >>> >> > > > > > Thanks for the feedback and support
>> >>> >> > > > > > Omnia
>> >>> >> > > > > >
>> >>> >> > > > >
>> >>> >> > > >
>> >>> >> > >
>> >>> >> >
>> >>>
>> >>
>>
>

Reply via email to