Re: [DISCUSS] KIP-382: MirrorMaker 2.0

Alex Mironov Tue, 23 Oct 2018 04:07:12 -0700

Hey Ryanne,

Awesome KIP, excited to see improvements in MirrorMaker land. I particularly
like the reuse of the Connect framework! Would it be possible to utilize the
Message Headers feature to prevent infinite recursion? For example, MM2
could stamp every message with a special header payload (e.g.
MM2="cluster-name-foo"), so that if another MM2 instance sees this message
and is configured to replicate data into "cluster-name-foo", it would
just skip it instead of replicating it back.
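
To make the idea concrete, here is a rough sketch of the check I have in mind.
It is not tied to any real MM2 or Kafka client API: headers are modelled as a
plain list of (key, bytes) pairs, and the header key "MM2" and the function
names are only assumptions for illustration.

```python
# Hypothetical header key; the proposal only suggests something like
# MM2="cluster-name-foo".
MM2_HEADER = "MM2"


def origin_clusters(headers):
    """Return the names of all clusters already stamped on a record."""
    return {value.decode("utf-8") for key, value in headers if key == MM2_HEADER}


def mirror(headers, source_cluster, target_cluster):
    """Decide whether to replicate a record from source to target.

    Returns the headers to write to the target cluster, or None if the
    record has already been in the target cluster and must be skipped.
    """
    if target_cluster in origin_clusters(headers):
        return None  # would be replicated back to where it came from
    # Stamp the record with the cluster it is being read from.
    return list(headers) + [(MM2_HEADER, source_cluster.encode("utf-8"))]


# A record produced in "foo" and mirrored to "bar" gets stamped with "foo".
in_bar = mirror([], "foo", "bar")
# An instance replicating "bar" -> "foo" sees the stamp and skips the record.
back_to_foo = mirror(in_bar, "bar", "foo")
```

Appending one stamp per hop (rather than overwriting a single value) would
also cover cycles across more than two clusters, at the cost of slightly
larger headers.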


On Sat, Oct 20, 2018 at 5:48 AM Ryanne Dolan <ryannedo...@gmail.com> wrote:

> Thanks Harsha. Done.
>
> On Fri, Oct 19, 2018 at 1:03 AM Harsha Chintalapani <ka...@harsha.io>
> wrote:
>
> > Ryanne,
> >        Makes sense. Can you please add this under rejected alternatives
> > so that everyone has context on why it wasn’t picked.
> >
> > Thanks,
> > Harsha
> > On Oct 18, 2018, 8:02 AM -0700, Ryanne Dolan <ryannedo...@gmail.com>,
> > wrote:
> >
> > Harsha, concerning uReplicator specifically, the project is a major
> > inspiration for MM2, but I don't think it is a good foundation for anything
> > included in Apache Kafka. uReplicator uses Helix to solve problems that
> > Connect also solves, e.g. REST API, live configuration changes, cluster
> > management, coordination etc. This also means that existing tooling,
> > dashboards etc that work with Connectors do not work with uReplicator, and
> > any future tooling would need to treat uReplicator as a special case.
> >
> > Ryanne
> >
> > On Wed, Oct 17, 2018 at 12:30 PM Ryanne Dolan <ryannedo...@gmail.com>
> > wrote:
> >
> >> Harsha, yes I can do that. I'll update the KIP accordingly, thanks.
> >>
> >> Ryanne
> >>
> >> On Wed, Oct 17, 2018 at 12:18 PM Harsha <ka...@harsha.io> wrote:
> >>
> >>> Hi Ryanne,
> >>>                Thanks for the KIP. I am also curious about why not use
> >>> the uReplicator design as the foundation, given it already resolves some
> >>> of the fundamental issues in the current MirrorMaker: updating the configs
> >>> on the fly and running the mirror maker agents in a worker model which can
> >>> be deployed in Mesos or container orchestrators. If possible, can you
> >>> document in the rejected alternatives what the missing parts are that made
> >>> you consider a new design from the ground up?
> >>>
> >>> Thanks,
> >>> Harsha
> >>>
> >>> On Wed, Oct 17, 2018, at 8:34 AM, Ryanne Dolan wrote:
> >>> > Jan, these are two separate issues.
> >>> >
> >>> > 1) consumer coordination should not, ideally, involve unreliable or slow
> >>> > connections. Naively, a KafkaSourceConnector would coordinate via the
> >>> > source cluster. We can do better than this, but I'm deferring this
> >>> > optimization for now.
> >>> >
> >>> > 2) exactly-once between two clusters is mind-bending. But keep in mind that
> >>> > transactions are managed by the producer, not the consumer. In fact, it's
> >>> > the producer that requests that offsets be committed for the current
> >>> > transaction. Obviously, these offsets are committed in whatever cluster the
> >>> > producer is sending to.
> >>> >
> >>> > These two issues are closely related. They are both resolved by not
> >>> > coordinating or committing via the source cluster. And in fact, this is the
> >>> > general model of SourceConnectors anyway, since most SourceConnectors
> >>> > _only_ have a destination cluster.
> >>> >
> >>> > If there is a lot of interest here, I can expound further on this aspect of
> >>> > MM2, but again I think this is premature until this first KIP is approved.
> >>> > I intend to address each of these in separate KIPs following this one.
> >>> >
> >>> > Ryanne
> >>> >
> >>> > On Wed, Oct 17, 2018 at 7:09 AM Jan Filipiak <jan.filip...@trivago.com>
> >>> > wrote:
> >>> >
> >>> > > This is not a performance optimisation. It's a fundamental design choice.
> >>> > >
> >>> > >
> >>> > > I never really took a look at how Streams does exactly-once (it's a trap
> >>> > > anyway, and you can usually deal with at-least-once downstream pretty
> >>> > > easily). But I am very certain it's not gonna get anywhere if the offset
> >>> > > commit cluster and the record produce cluster are not the same.
> >>> > >
> >>> > > Pretty sure that without this _design choice_ you can skip exactly-once
> >>> > > already.
> >>> > >
> >>> > > Best Jan
> >>> > >
> >>> > > On 16.10.2018 18:16, Ryanne Dolan wrote:
> >>> > > >  >  But one big obstacle in this was
> >>> > > >  >  always that group coordination happened on the source cluster.
> >>> > > >
> >>> > > > Jan, thank you for bringing up this issue with legacy MirrorMaker. I
> >>> > > > totally agree with you. This is one of several problems with MirrorMaker
> >>> > > > I intend to solve in MM2, and I already have a design and prototype that
> >>> > > > solves this and related issues. But as you pointed out, this KIP is
> >>> > > > already rather complex, and I want to focus on the core feature set
> >>> > > > rather than performance optimizations for now. If we can agree on what
> >>> > > > MM2 looks like, it will be very easy to agree to improve its performance
> >>> > > > and reliability.
> >>> > > >
> >>> > > > That said, I look forward to your support on a subsequent KIP that
> >>> > > > addresses consumer coordination and rebalance issues. Stay tuned!
> >>> > > >
> >>> > > > Ryanne
> >>> > > >
> >>> > > > On Tue, Oct 16, 2018 at 6:58 AM Jan Filipiak <jan.filip...@trivago.com
> >>> > > > <mailto:jan.filip...@trivago.com>> wrote:
> >>> > > >
> >>> > > >     Hi,
> >>> > > >
> >>> > > >     Currently MirrorMaker is usually run collocated with the target
> >>> > > >     cluster.
> >>> > > >     This is all nice and good. But one big obstacle in this was
> >>> > > >     always that group coordination happened on the source cluster. So
> >>> > > >     when the network was congested, you sometimes lose group membership
> >>> > > >     and have to rebalance and all this.
> >>> > > >
> >>> > > >     So one big request from me would be support for having
> >>> > > >     coordination cluster != source cluster.
> >>> > > >
> >>> > > >     I would generally say a LAN is better than a WAN for doing group
> >>> > > >     coordination, and there is no reason we couldn't have a group
> >>> > > >     consuming topics from a different cluster and committing offsets
> >>> > > >     to another one, right?
> >>> > > >
> >>> > > >     Other than that, it feels like the KIP has too many features, where
> >>> > > >     many of them are not really wanted and are counterproductive, but I
> >>> > > >     will just wait and see how the discussion goes.
> >>> > > >
> >>> > > >     Best Jan
> >>> > > >
> >>> > > >
> >>> > > >     On 15.10.2018 18:16, Ryanne Dolan wrote:
> >>> > > >      > Hey y'all!
> >>> > > >      >
> >>> > > >      > Please take a look at KIP-382:
> >>> > > >      >
> >>> > > >      > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
> >>> > > >      >
> >>> > > >      > Thanks for your feedback and support.
> >>> > > >      >
> >>> > > >      > Ryanne
> >>> > > >      >
> >>> > > >
> >>> > >
> >>>
> >>
>


-- 
Best,
Alex Mironov
