Hey, sorry for arriving late, but I have a question about the rejected
alternative involving configuration.

I may be missing something, but we are looking at 3 new configs (one for
each topic). And this rejected alternative is basically identical to what
Connect already does (you can choose names for internal topics using
configs). How is it too complex? As a user, configuring 3 fields seems much
simpler than implementing a class. As an admin, trusting to run someone's
code is scary, but a config with topic name is very safe and easy to test
and trust.

There is a discussion about how this gets more complicated in some
scenarios, but it was a bit abstract - is there a concrete case that shows
why configuration is more complicated than implementing a plugin?

Gwen

On Fri, Apr 30, 2021 at 2:30 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
wrote:

> Hi Ryanne, Can you vote for it here
> https://www.mail-archive.com/dev@kafka.apache.org/msg113575.html as well,
> please?
>
> On Fri, Apr 30, 2021 at 12:44 AM Ryanne Dolan <ryannedo...@gmail.com>
> wrote:
>
> > Thanks Omnia. lgtm!
> >
> > Ryanne
> >
> > On Thu, Apr 29, 2021 at 10:50 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> > wrote:
> >
> >> I updated the KIP
> >>
> >> On Thu, Apr 29, 2021 at 4:43 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> >> wrote:
> >>
> >>> Sure, this would make it easier, we can make these functions returns
> the
> >>> original behaviour (<clusterAlias>.checkpoints.internal,
> >>> "mm2-offset-syncs.<clusterAlias>.internal", heartbeat) without any
> >>> customisation using `replication.policy.separator` and use the
> separator in
> >>> the DefaultReplicationPolicy
> >>>
> >>> On Wed, Apr 28, 2021 at 1:31 AM Ryanne Dolan <ryannedo...@gmail.com>
> >>> wrote:
> >>>
> >>>> Thanks Omnia, makes sense to me.
> >>>>
> >>>> > Customers who have their customised ReplicationPolicy will need to
> >>>> add the definition of their internal topics naming convention
> >>>>
> >>>> I wonder should we include default impls in the interface to avoid
> that
> >>>> requirement?
> >>>>
> >>>> Ryanne
> >>>>
> >>>> On Sun, Apr 25, 2021, 2:20 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Mickael and Ryanne,
> >>>>> I updated the KIP to add these methods to the ReplicationPolicy
> >>>>> instead of an extra interface to simplify the changes. Please have a
> look
> >>>>> and let me know your thoughts.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> On Tue, Mar 30, 2021 at 7:19 PM Omnia Ibrahim <
> o.g.h.ibra...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> *(sorry forgot to Replay to All) *
> >>>>>> Hi Ryanne,
> >>>>>> It's a valid concern, I was trying to separate the concerns of
> >>>>>> internal and replicated policy away from each other and to make the
> code
> >>>>>> readable as extending ReplicationPolicy to manage both internal and
> >>>>>> replicated topic is a bit odd. Am not against simplifying things
> out to
> >>>>>> make ReplicationPolicy handling both at the end of the day if an
> MM2 user
> >>>>>> has a special naming convention for topics it will be affecting both
> >>>>>> replicated and MM2 internal topics.
> >>>>>>
> >>>>>> For simplifying things we can extend `ReplicationPolicy` to the
> >>>>>> following instead of adding an extra class
> >>>>>>
> >>>>>>> *public interface ReplicationPolicy {*
> >>>>>>>     String topicSource(String topic);
> >>>>>>>     String upstreamTopic(String topic);
> >>>>>>>
> >>>>>>>
> >>>>>>> */** Returns heartbeats topic name.*/    String heartbeatsTopic();*
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> *    /** Returns the offset-syncs topic for given cluster alias. */
> >>>>>>>   String offsetSyncTopic(String targetAlias);    /** Returns the
> name
> >>>>>>> checkpoint topic for given cluster alias. */    String
> >>>>>>> checkpointTopic(String sourceAlias); *
> >>>>>>>
> >>>>>>>     default String originalTopic(String topic) {
> >>>>>>>         String upstream = upstreamTopic(topic);
> >>>>>>>         if (upstream == null) {
> >>>>>>>             return topic;
> >>>>>>>         } else {
> >>>>>>>             return originalTopic(upstream);
> >>>>>>>         }
> >>>>>>>     }
> >>>>>>>
> >>>>>>>
> >>>>>>> *    /** Internal topics are never replicated. */
> >>>>>>> isInternalTopic(String topic) *//the implementaion will be moved to
> >>>>>>> `DefaultReplicationPolicy` to handle both kafka topics and MM2
> internal
> >>>>>>> topics.
> >>>>>>> }
> >>>>>>>
> >>>>>>
> >>>>>> On Fri, Mar 26, 2021 at 3:05 PM Ryanne Dolan <ryannedo...@gmail.com
> >
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Omnia, have we considered just adding methods to ReplicationPolicy?
> >>>>>>> I'm reluctant to add a new class because, as Mickael points out,
> we'd need
> >>>>>>> to carry it around in client code.
> >>>>>>>
> >>>>>>> Ryanne
> >>>>>>>
> >>>>>>> On Fri, Feb 19, 2021 at 8:31 AM Mickael Maison <
> >>>>>>> mickael.mai...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi Omnia,
> >>>>>>>>
> >>>>>>>> Thanks for the clarifications.
> >>>>>>>>
> >>>>>>>> - I'm still a bit uneasy with the overlap between these 2 methods
> as
> >>>>>>>> currently `ReplicationPolicy.isInternalTopic` already handles MM2
> >>>>>>>> internal topics. Should we make it only handle Kafka internal
> topics
> >>>>>>>> and `isMM2InternalTopic()` only handle MM2 topics?
> >>>>>>>>
> >>>>>>>> - I'm not sure I understand what this method is used for. There
> are
> >>>>>>>> no
> >>>>>>>> such methods for the other 2 topics (offset-sync and heartbeat).
> >>>>>>>> Also
> >>>>>>>> what happens if there are other MM2 instances using different
> naming
> >>>>>>>> schemes in the same cluster. Do all instances have to know about
> the
> >>>>>>>> other naming schemes? What are the expected issues if they don't?
> >>>>>>>>
> >>>>>>>> - RemoteClusterUtils is a client-side utility so it does not have
> >>>>>>>> access to the MM2 configuration. Since this new API can affect the
> >>>>>>>> name of the checkpoint topic, it will need to be used client-side
> >>>>>>>> too
> >>>>>>>> so users can find the checkpoint topic name. I had to realized
> this
> >>>>>>>> was the case.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>>
> >>>>>>>> On Mon, Feb 15, 2021 at 9:33 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com> wrote:
> >>>>>>>> >
> >>>>>>>> > Hi Mickael, did you have some time to check my answer?
> >>>>>>>> >
> >>>>>>>> > On Thu, Jan 21, 2021 at 10:10 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com> wrote:
> >>>>>>>> >>
> >>>>>>>> >> Hi Mickael,
> >>>>>>>> >> Thanks for taking another look into the KIP, regards your
> >>>>>>>> questions
> >>>>>>>> >>
> >>>>>>>> >> - I believe we need both "isMM2InternalTopic" and
> >>>>>>>> `ReplicationPolicy.isInternalTopic`  as
> `ReplicationPolicy.isInternalTopic`
> >>>>>>>> does check if a topic is Kafka internal topic, while
> `isMM2InternalTopic`
> >>>>>>>> is just focusing if a topic is MM2 internal topic or not(which is
> >>>>>>>> heartbeat/checkpoint/offset-sync). The fact that the default for
> MM2
> >>>>>>>> internal topics matches "ReplicationPolicy.isInternalTopic" will
> not be an
> >>>>>>>> accurate assumption anymore once we implement this KIP.
> >>>>>>>> >>
> >>>>>>>> >> - "isCheckpointTopic" will detect all checkpoint topics for all
> >>>>>>>> MM2 instances this is needed for "MirrorClient.checkpointTopics"
> which
> >>>>>>>> originally check if the topic name ends with
> CHECKPOINTS_TOPIC_SUFFIX. So
> >>>>>>>> this method just to keep the same functionality that originally
> exists in
> >>>>>>>> MM2
> >>>>>>>> >>
> >>>>>>>> >> - "checkpointTopic" is used in two places 1. At topic creation
> >>>>>>>> in "MirrorCheckpointConnector.createInternalTopics" which use
> >>>>>>>> "sourceClusterAlias() + CHECKPOINTS_TOPIC_SUFFIX" and 2. At
> >>>>>>>> "MirrorClient.remoteConsumerOffsets" which is called by
> >>>>>>>> "RemoteClusterUtils.translateOffsets"  the cluster alias here
> referred to
> >>>>>>>> as "remoteCluster" where the topic name is "remoteClusterAlias +
> >>>>>>>> CHECKPOINTS_TOPIC_SUFFIX"  (which is an argument in
> RemoteClusterUtils, not
> >>>>>>>> a config) This why I called the variable cluster instead of
> source and
> >>>>>>>> instead of using the config to figure out the cluster aliases
> from config
> >>>>>>>> as we use checkpoints to keep `RemoteClusterUtils` compatible for
> existing
> >>>>>>>> users. I see a benefit of just read the config a find out the
> cluster
> >>>>>>>> aliases but on the other side, I'm not sure why
> "RemoteClusterUtils"
> >>>>>>>> doesn't get the name of the cluster from the properties instead
> of an
> >>>>>>>> argument, so I decided to keep it just for compatibility.
> >>>>>>>> >>
> >>>>>>>> >> Hope these answer some of your concerns.
> >>>>>>>> >> Best
> >>>>>>>> >> Omnia
> >>>>>>>> >>
> >>>>>>>> >> On Thu, Jan 21, 2021 at 3:37 PM Mickael Maison <
> >>>>>>>> mickael.mai...@gmail.com> wrote:
> >>>>>>>> >>>
> >>>>>>>> >>> Hi Omnia,
> >>>>>>>> >>>
> >>>>>>>> >>> Thanks for the updates. Sorry for the delay but I have a few
> >>>>>>>> more
> >>>>>>>> >>> small questions about the API:
> >>>>>>>> >>> - Do we really need "isMM2InternalTopic()"? There's already
> >>>>>>>> >>> "ReplicationPolicy.isInternalTopic()". If so, we need to
> >>>>>>>> explain the
> >>>>>>>> >>> difference between these 2 methods.
> >>>>>>>> >>>
> >>>>>>>> >>> - Is "isCheckpointTopic()" expected to detect all checkpoint
> >>>>>>>> topics
> >>>>>>>> >>> (for all MM2 instances) or only the ones for this connector
> >>>>>>>> instance.
> >>>>>>>> >>> If it's the later, I wonder if we could do without the method.
> >>>>>>>> As this
> >>>>>>>> >>> interface is only called by MM2, we could first call
> >>>>>>>> >>> "checkpointTopic()" and check if that's equal to the topic
> we're
> >>>>>>>> >>> checking. If it's the former, we don't really know topic names
> >>>>>>>> other
> >>>>>>>> >>> MM2 instances may be using!
> >>>>>>>> >>>
> >>>>>>>> >>> - The 3 methods returning topic names have different APIs:
> >>>>>>>> >>> "heartbeatsTopic()" takes no arguments, "offsetSyncTopic()"
> >>>>>>>> takes the
> >>>>>>>> >>> target cluster alias and "checkpointTopic()" takes
> >>>>>>>> "clusterAlias"
> >>>>>>>> >>> (which one is it? source or target?). As the interface extends
> >>>>>>>> >>> Configurable, maybe we could get rid of all the arguments and
> >>>>>>>> use the
> >>>>>>>> >>> config to find the cluster aliases. WDYT?
> >>>>>>>> >>>
> >>>>>>>> >>> These are minor concerns, just making sure I fully understand
> >>>>>>>> how the
> >>>>>>>> >>> API is expected to be used. Once these are cleared, I'll be
> >>>>>>>> happy to
> >>>>>>>> >>> vote for this KIP.
> >>>>>>>> >>>
> >>>>>>>> >>> Thanks
> >>>>>>>> >>>
> >>>>>>>> >>> On Fri, Jan 8, 2021 at 12:06 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com> wrote:
> >>>>>>>> >>> >
> >>>>>>>> >>> > Hi Mickael,
> >>>>>>>> >>> > Did you get time to review the changes to the KIP? If you
> >>>>>>>> okay with it could you vote for the KIP here ttps://
> >>>>>>>> www.mail-archive.com/dev@kafka.apache.org/msg113575.html?
> >>>>>>>> >>> > Thanks
> >>>>>>>> >>> >
> >>>>>>>> >>> > On Thu, Dec 10, 2020 at 2:19 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com> wrote:
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> Hi Mickael,
> >>>>>>>> >>> >> 1) That's right the interface and default implementation
> >>>>>>>> will in mirror-connect
> >>>>>>>> >>> >> 2) Renaming the interface should be fine too especially if
> >>>>>>>> you planning to move other functionality related to the creation
> there, I
> >>>>>>>> can edit this
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> if you are okay with that please vote for the KIP here
> >>>>>>>> https://www.mail-archive.com/dev@kafka.apache.org/msg113575.html
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> Thanks
> >>>>>>>> >>> >> Omnia
> >>>>>>>> >>> >> On Thu, Dec 10, 2020 at 12:58 PM Mickael Maison <
> >>>>>>>> mickael.mai...@gmail.com> wrote:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Hi Omnia,
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Thank you for the reply, it makes sense.
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> A couple more comments:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> 1) I'm assuming the new interface and default
> >>>>>>>> implementation will be
> >>>>>>>> >>> >>> in the mirror-client project? as the names of some of
> these
> >>>>>>>> topics are
> >>>>>>>> >>> >>> needed by RemoteClusterUtils on the client-side.
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> 2) I'm about to open a KIP to specify where the
> >>>>>>>> offset-syncs topic is
> >>>>>>>> >>> >>> created by MM2. In restricted environments, we'd prefer
> MM2
> >>>>>>>> to only
> >>>>>>>> >>> >>> have read access to the source cluster and have the
> >>>>>>>> offset-syncs on
> >>>>>>>> >>> >>> the target cluster. I think allowing to specify the
> cluster
> >>>>>>>> where to
> >>>>>>>> >>> >>> create that topic would be a natural extension of the
> >>>>>>>> interface you
> >>>>>>>> >>> >>> propose here.
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> So I wonder if your interface could be named
> >>>>>>>> InternalTopicsPolicy?
> >>>>>>>> >>> >>> That's a bit more generic than InternalTopicNamingPolicy.
> >>>>>>>> That would
> >>>>>>>> >>> >>> also match the configuration setting,
> >>>>>>>> internal.topics.policy.class,
> >>>>>>>> >>> >>> you're proposing.
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Thanks
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> On Thu, Dec 3, 2020 at 10:15 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com> wrote:
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > Hi Mickael,
> >>>>>>>> >>> >>> > Thanks for your feedback!
> >>>>>>>> >>> >>> > Regards your question about having more configurations,
> I
> >>>>>>>> considered adding
> >>>>>>>> >>> >>> > configuration per each topic however this meant adding
> >>>>>>>> more configurations
> >>>>>>>> >>> >>> > for MM2 which already have so many, also the more
> >>>>>>>> complicated and advanced
> >>>>>>>> >>> >>> > replication pattern you have between clusters the more
> >>>>>>>> configuration lines
> >>>>>>>> >>> >>> > will be added to your MM2 config which isn't going to be
> >>>>>>>> pretty if you
> >>>>>>>> >>> >>> > don't have the same topics names across your clusters.
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > Also, it added more complexity to the implementation as
> >>>>>>>> MM2 need to
> >>>>>>>> >>> >>> > 1- identify if a topic is checkpoints so we could list
> >>>>>>>> the checkpoints
> >>>>>>>> >>> >>> > topics in MirrorMaker 2 utils as one cluster could have
> X
> >>>>>>>> numbers
> >>>>>>>> >>> >>> > checkpoints topics if it's connected to X clusters, this
> >>>>>>>> is done right now
> >>>>>>>> >>> >>> > by listing any topic with suffix
> `.checkpoints.internal`.
> >>>>>>>> This could be
> >>>>>>>> >>> >>> > done by add `checkpoints.topic.suffix` config but this
> >>>>>>>> would make an
> >>>>>>>> >>> >>> > assumption that checkpoints will always have a suffix
> >>>>>>>> also having a suffix
> >>>>>>>> >>> >>> > means that we may need a separator as well to
> concatenate
> >>>>>>>> this suffix with
> >>>>>>>> >>> >>> > a prefix to identify source cluster name.
> >>>>>>>> >>> >>> > 2- identify if a topic is internal, so it shouldn't be
> >>>>>>>> replicated or track
> >>>>>>>> >>> >>> > checkpoints for it, right now this is relaying on
> >>>>>>>> disallow topics with
> >>>>>>>> >>> >>> > `.internal` suffix to be not replicated and not tracked
> >>>>>>>> in checkpoints but
> >>>>>>>> >>> >>> > with making topics configurable we need a way to define
> >>>>>>>> what is an internal
> >>>>>>>> >>> >>> > topic. This could be done by making using a list of all
> >>>>>>>> internal topics
> >>>>>>>> >>> >>> > have been entered to the configuration.
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > So having an interface seemed easier and also give more
> >>>>>>>> flexibility for
> >>>>>>>> >>> >>> > users to define their own topics name, define what is
> >>>>>>>> internal topic means,
> >>>>>>>> >>> >>> > how to find checkpoints topics and it will be one line
> >>>>>>>> config for each
> >>>>>>>> >>> >>> > herder, also it more consistence with MM2 code as MM2
> >>>>>>>> config have
> >>>>>>>> >>> >>> > TopicFilter, ReplicationPolicy, GroupFilter, etc as
> >>>>>>>> interface and they can
> >>>>>>>> >>> >>> > be overridden by providing a custom implementation for
> >>>>>>>> them or have some
> >>>>>>>> >>> >>> > config that change their default implementations.
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > Hope this answer your question. I also updated the KIP
> to
> >>>>>>>> add this to the
> >>>>>>>> >>> >>> > rejected solutions.
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > On Thu, Dec 3, 2020 at 3:19 PM Mickael Maison <
> >>>>>>>> mickael.mai...@gmail.com>
> >>>>>>>> >>> >>> > wrote:
> >>>>>>>> >>> >>> >
> >>>>>>>> >>> >>> > > Hi Omnia,
> >>>>>>>> >>> >>> > >
> >>>>>>>> >>> >>> > > Thanks for the KIP. Indeed being able to configure
> >>>>>>>> MM2's internal
> >>>>>>>> >>> >>> > > topic names would be a nice improvement.
> >>>>>>>> >>> >>> > >
> >>>>>>>> >>> >>> > > Looking at the KIP, I was surprised you propose an
> >>>>>>>> interface to allow
> >>>>>>>> >>> >>> > > users to specify names. Have you considered making
> >>>>>>>> names changeable
> >>>>>>>> >>> >>> > > via configurations? If so, we should definitely
> mention
> >>>>>>>> it in the
> >>>>>>>> >>> >>> > > rejected alternatives as it's the first method that
> >>>>>>>> comes to mind.
> >>>>>>>> >>> >>> > >
> >>>>>>>> >>> >>> > > I understand an interface gives a lot of flexibility
> >>>>>>>> but I'd expect
> >>>>>>>> >>> >>> > > topic names to be relatively simple and known in
> >>>>>>>> advance in most
> >>>>>>>> >>> >>> > > cases.
> >>>>>>>> >>> >>> > >
> >>>>>>>> >>> >>> > > I've not checked all use cases but something like
> below
> >>>>>>>> felt appropriate:
> >>>>>>>> >>> >>> > > clusters = primary,backup
> >>>>>>>> >>> >>> > >
> >>>>>>>> primary->backup.offsets-sync.topic=backup.mytopic-offsets-sync
> >>>>>>>> >>> >>> > >
> >>>>>>>> >>> >>> > > On Tue, Dec 1, 2020 at 3:36 PM Omnia Ibrahim <
> >>>>>>>> o.g.h.ibra...@gmail.com>
> >>>>>>>> >>> >>> > > wrote:
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > > > Hey everyone,
> >>>>>>>> >>> >>> > > > Please take a look at KIP-690:
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > >
> >>>>>>>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-690%3A+Add+additional+configuration+to+control+MirrorMaker+2+internal+topics+naming+convention
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > > > Thanks for your feedback and support.
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > > > Omnia
> >>>>>>>> >>> >>> > > >
> >>>>>>>> >>> >>> > >
> >>>>>>>>
> >>>>>>>
>


-- 
Gwen Shapira
Engineering Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Reply via email to