Hello everyone,

I've drafted a FLIP that describes the current design of the Pulsar connector:

https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#

Please take a look and let me know what you think.

Thanks,
Yijie

On Sat, Sep 14, 2019 at 12:08 AM Rong Rong <walter...@gmail.com> wrote:
>
> Hi All,
>
> Sorry for joining the discussion late and thanks Yijie & Sijie for driving
> the discussion.
> I also think the Pulsar connector would be a very valuable addition to
> Flink. I can also help out a bit on the review side :-)
>
> Regarding the timeline, I also share concerns with Becket on the
> relationship between the new Pulsar connector and FLIP-27.
> There is also another discussion, just started by Stephan, on dropping Kafka
> 0.9/0.10 support in the next Flink release [1]. Although the situation is
> somewhat different, and the Kafka 0.9/0.10 connectors have been in Flink for
> 3-4 years, based on that discussion I am not sure a major version release is
> required for removing old connector support.
>
> I think there shouldn't be a blocker if we agree the old connector will be
> removed once FLIP-27 based Pulsar connector is there. As Stephan stated, it
> is easier to contribute the source sooner and adjust it later.
> We should also ensure we clearly communicate the message: for example,
> putting an experimental flag on the pre-FLIP-27 connector page of the
> website, documentation, etc. Any other thoughts?
>
> --
> Rong
>
> [1]
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html
>
>
> On Fri, Sep 13, 2019 at 8:15 AM Becket Qin <becket....@gmail.com> wrote:
>
> > Technically speaking, removing the old connector code is a backwards
> > incompatible change which requires a major version bump, i.e. Flink 2.x.
> > Given that we don't have a clear plan on when to have the next major
> > version release, it seems unclear how long the old connector code will be
> > there if we check it in right now. Or will we remove the old connector
> > without a major version bump? In any case, it does not sound very user
> > friendly to those who might use the old Pulsar connector. I am not sure
> > if it is worth these potential problems in order to have the Pulsar source
> > connector checked in one or two months earlier.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Sep 12, 2019 at 3:52 PM Stephan Ewen <se...@apache.org> wrote:
> >
> > > Agreed, if we check in the old code, we should make it clear that it will
> > > be removed as soon as the FLIP-27 based version of the connector is there.
> > > We should not commit to maintaining the old version; that would indeed be
> > > too much overhead.
> > >
> > > On Thu, Sep 12, 2019 at 3:30 AM Becket Qin <becket....@gmail.com> wrote:
> > >
> > > > Hi Stephan,
> > > >
> > > > Thanks for volunteering to help.
> > > >
> > > > Yes, the overhead would just be review capacity. In fact, I am not
> > > > worrying too much about the review capacity. That is just a one-time
> > > > cost. My concern is mainly about the long-term burden. Assume we have
> > > > the new source interface ready in 1.10 with a newly added Pulsar
> > > > connector on the old interface. Later on, if we migrate Pulsar to the
> > > > new source interface, the old Pulsar connector might be deprecated
> > > > almost immediately after being checked in, but we may still have to
> > > > maintain two code bases. For the existing connectors, we have to do
> > > > that anyway. But it would be good to avoid introducing a new connector
> > > > with the same problem.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Sep 10, 2019 at 6:51 PM Stephan Ewen <se...@apache.org> wrote:
> > > >
> > > > > Hi all!
> > > > >
> > > > > Nice to see this lively discussion about the Pulsar connector.
> > > > > Some thoughts on the open questions:
> > > > >
> > > > > ## Contribute to Flink or maintain as a community package
> > > > >
> > > > > Looks like the discussion is more going towards contribution. I think
> > > > that
> > > > > is good, especially if we think that we want to build a similarly
> > deep
> > > > > integration with Pulsar as we have for example with Kafka. The
> > > connector
> > > > > already looks like a more thorough connector than many others we have
> > > in
> > > > > the repository.
> > > > >
> > > > > With either a repo split, or the new build system, I hope that the
> > > build
> > > > > overhead is not a problem.
> > > > >
> > > > > ## Committer Support
> > > > >
> > > > > Becket offered some help already, I can also help a bit. I hope that
> > > > > between us, we can cover this.
> > > > >
> > > > > ## Contribute now, or wait for FLIP-27
> > > > >
> > > > > As Becket said, FLIP-27 is actually making some PoC-ing progress, but
> > > > will
> > > > > take 2 more months, I would estimate, before it is fully available.
> > > > >
> > > > > If we want to be on the safe side with the contribution, we should
> > > > > contribute the source sooner and adjust it later. That would also
> > help
> > > us
> > > > > in case things get crazy towards the 1.10 feature freeze and it would
> > > be
> > > > > hard to find time to review the new changes.
> > > > > What would be the overhead of contributing now? Given that the code
> > is
> > > > > already there, it looks like it would be only review capacity, right?
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > > On Tue, Sep 10, 2019 at 11:04 AM Yijie Shen <
> > henry.yijies...@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi everyone!
> > > > > >
> > > > > > Thanks for your attention and the promotion of this work.
> > > > > >
> > > > > > We will prepare a FLIP as soon as possible for more specific
> > > > discussions.
> > > > > >
> > > > > > For FLIP-27, it seems that we have not reached a consensus.
> > > Therefore,
> > > > > > I will explain all the functionalities of the existing connector in
> > > > > > the FLIP (including Source, Sink, and Catalog) to continue our
> > > > > > discussions in FLIP.
> > > > > >
> > > > > > Thanks for your kind help.
> > > > > >
> > > > > > Best,
> > > > > > Yijie
> > > > > >
> > > > > > On Tue, Sep 10, 2019 at 9:57 AM Becket Qin <becket....@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > > Hi Sijie,
> > > > > > >
> > > > > > > If we agree that the goal is to have the Pulsar connector in 1.10, how
> > > > > > > about we do the following:
> > > > > > >
> > > > > > > 0. Start a FLIP to add the Pulsar connector to the Flink main repo, as
> > > > > > > it is a new public interface to the Flink main repo.
> > > > > > > 1. Start to review the Pulsar sink right away, as there is no change to
> > > > > > > the sink interface so far.
> > > > > > > 2. Wait a little bit on FLIP-27. Flink 1.10 is going to be code frozen
> > > > > > > in late Nov, and let's say we give a month to the development and
> > > > > > > review of the Pulsar connector, so we need to have FLIP-27 by late Oct.
> > > > > > > There are still 7 weeks. Personally I think it is doable. If FLIP-27 is
> > > > > > > not ready by late Oct, we can review and check in the Pulsar connector
> > > > > > > with the existing source interface. This means we will have the Pulsar
> > > > > > > connector in Flink 1.10, either with or without FLIP-27.
> > > > > > >
> > > > > > > Because we are going to have the Pulsar sink and source checked in
> > > > > > > separately, it might make sense to have two FLIPs, one for the Pulsar
> > > > > > > sink and another for the Pulsar source. And we can start the work on
> > > > > > > the Pulsar sink right away.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Mon, Sep 9, 2019 at 4:13 PM Sijie Guo <guosi...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Thank you Bowen and Becket.
> > > > > > > >
> > > > > > > > What's the take from the Flink community? Shall we wait for
> > > > > > > > FLIP-27, or shall we proceed to the next steps? And what are the
> > > > > > > > next steps? :-)
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Sijie
> > > > > > > >
> > > > > > > > On Thu, Sep 5, 2019 at 2:43 PM Bowen Li <bowenl...@gmail.com>
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I think having a Pulsar connector in Flink can be a good
> > mutual
> > > > > > benefit
> > > > > > > > to
> > > > > > > > > both communities.
> > > > > > > > >
> > > > > > > > > Another perspective is that Pulsar connector is the 1st
> > > streaming
> > > > > > > > connector
> > > > > > > > > that integrates with Flink's metadata management system and
> > > > Catalog
> > > > > > APIs.
> > > > > > > > > It'll be cool to see how the integration turns out and
> > whether
> > > we
> > > > > > need to
> > > > > > > > > improve Flink Catalog stack, which are currently in Beta, to
> > > > cater
> > > > > to
> > > > > > > > > streaming source/sink. Thus I'm in favor of merging Pulsar
> > > > > connector
> > > > > > into
> > > > > > > > > Flink 1.10.
> > > > > > > > >
> > > > > > > > > I'd suggest to submit smaller sized PRs, e.g. maybe one for
> > > basic
> > > > > > > > > source/sink functionalities and another for schema and
> > catalog
> > > > > > > > integration,
> > > > > > > > > just to make them easier to review.
> > > > > > > > >
> > > > > > > > > It doesn't seem to hurt to wait for FLIP-27. But I don't
> > think
> > > > > > FLIP-27
> > > > > > > > > should be a blocker in cases where it cannot make its way
> > into
> > > > 1.10
> > > > > > or
> > > > > > > > > doesn't leave reasonable amount of time for committers to
> > > review
> > > > or
> > > > > > for
> > > > > > > > > Pulsar connector to fully adapt to new interfaces.
> > > > > > > > >
> > > > > > > > > Bowen
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Sep 5, 2019 at 3:21 AM Becket Qin <
> > > becket....@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Till,
> > > > > > > > > >
> > > > > > > > > > You are right. It all depends on when the new source
> > > interface
> > > > is
> > > > > > going
> > > > > > > > > to
> > > > > > > > > > be ready. Personally I think it would be there in about a
> > > month
> > > > > or
> > > > > > so.
> > > > > > > > > But
> > > > > > > > > > I could be too optimistic. It would also be good to hear
> > what
> > > > do
> > > > > > > > Aljoscha
> > > > > > > > > > and Stephan think as they are also involved in FLIP-27.
> > > > > > > > > >
> > > > > > > > > > In general I think we should have Pulsar connector in Flink
> > > > 1.10,
> > > > > > > > > > preferably with the new source interface. We can also check
> > > it
> > > > in
> > > > > > right
> > > > > > > > > now
> > > > > > > > > > with old source interface, but I suspect few users will use
> > > it
> > > > > > before
> > > > > > > > the
> > > > > > > > > > next official release. Therefore, it seems reasonable to
> > > wait a
> > > > > > little
> > > > > > > > > bit
> > > > > > > > > > to see whether we can jump to the new source interface. As
> > > long
> > > > > as
> > > > > > we
> > > > > > > > > make
> > > > > > > > > > sure Flink 1.10 has it, waiting a little bit doesn't seem
> > to
> > > > hurt
> > > > > > much.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 5, 2019 at 3:59 PM Till Rohrmann <
> > > > > trohrm...@apache.org
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I'm wondering what the problem would be if we committed
> > the
> > > > > > Pulsar
> > > > > > > > > > > connector before the new source interface is ready. If I
> > > > > > understood
> > > > > > > > it
> > > > > > > > > > > correctly, then we need to support the old source
> > interface
> > > > > > anyway
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > existing connectors. By checking it in early I could see
> > > the
> > > > > > benefit
> > > > > > > > > that
> > > > > > > > > > > our users could start using the connector earlier.
> > > Moreover,
> > > > it
> > > > > > would
> > > > > > > > > > > prevent that the Pulsar integration is being delayed in
> > > case
> > > > > > that the
> > > > > > > > > > > source interface should be delayed. The only downside I
> > see
> > > > is
> > > > > > the
> > > > > > > > > extra
> > > > > > > > > > > review effort and potential fixes which might be
> > irrelevant
> > > > for
> > > > > > the
> > > > > > > > new
> > > > > > > > > > > source interface implementation. I guess it mainly
> > depends
> > > on
> > > > > how
> > > > > > > > > certain
> > > > > > > > > > > we are when the new source interface will be ready.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Till
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Sep 5, 2019 at 8:56 AM Becket Qin <
> > > > > becket....@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Sijie and Yijie,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for sharing your thoughts.
> > > > > > > > > > > >
> > > > > > > > > > > > Just want to have some update on FLIP-27. Although the
> > > FLIP
> > > > > > wiki
> > > > > > > > and
> > > > > > > > > > > > discussion thread has been quiet for some time, a few
> > > > > > committer /
> > > > > > > > > > > > contributors in Flink community were actually
> > prototyping
> > > > the
> > > > > > > > entire
> > > > > > > > > > > thing.
> > > > > > > > > > > > We have made some good progress there but want to
> > update
> > > > the
> > > > > > FLIP
> > > > > > > > > wiki
> > > > > > > > > > > > after the entire thing is verified to work in case
> > there
> > > > are
> > > > > > some
> > > > > > > > > last
> > > > > > > > > > > > minute surprise in the implementation. I don't have an
> > > > exact
> > > > > > ETA
> > > > > > > > yet,
> > > > > > > > > > > but I
> > > > > > > > > > > > guess it is going to be within a month or so.
> > > > > > > > > > > >
> > > > > > > > > > > > I am happy to review the current Flink Pulsar connector
> > > and
> > > > > > see if
> > > > > > > > it
> > > > > > > > > > > would
> > > > > > > > > > > > fit in FLIP-27. It would be good to avoid the case that
> > > we
> > > > > > checked
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > Pulsar connector with some review efforts and shortly
> > > after
> > > > > > that
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > > Source interface is ready.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Sep 5, 2019 at 8:39 AM Yijie Shen <
> > > > > > > > henry.yijies...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for all the feedback and suggestions!
> > > > > > > > > > > > >
> > > > > > > > > > > > > As Sijie said, the goal of the connector has always
> > > been
> > > > to
> > > > > > > > provide
> > > > > > > > > > > > > users with the latest features of both systems as
> > soon
> > > as
> > > > > > > > possible.
> > > > > > > > > > We
> > > > > > > > > > > > > propose to contribute the connector to Flink and hope
> > > to
> > > > > get
> > > > > > more
> > > > > > > > > > > > > suggestions and feedback from Flink experts to ensure
> > > the
> > > > > > high
> > > > > > > > > > quality
> > > > > > > > > > > > > of the connector.
> > > > > > > > > > > > >
> > > > > > > > > > > > > For FLIP-27, we noticed its existence at the
> > beginning
> > > of
> > > > > > > > reworking
> > > > > > > > > > > > > the connector implementation based on Flink 1.9; we
> > > also
> > > > > > wanted
> > > > > > > > to
> > > > > > > > > > > > > build a connector that supports both batch and stream
> > > > > > computing
> > > > > > > > > based
> > > > > > > > > > > > > on it.
> > > > > > > > > > > > > However, it has been inactive for some time, so we
> > > > decided
> > > > > to
> > > > > > > > > provide
> > > > > > > > > > > > > a connector with most of the new features, such as
> > the
> > > > new
> > > > > > type
> > > > > > > > > > system
> > > > > > > > > > > > > and the new catalog API first. We will pay attention
> > to
> > > > the
> > > > > > > > > progress
> > > > > > > > > > > > > of FLIP-27 continually and incorporate it with the
> > > > > connector
> > > > > > as
> > > > > > > > > soon
> > > > > > > > > > > > > as possible.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding the test status of the connector, we are following the
> > > > > > > > > > > > > tests of the other connectors in the Flink repository and aim to
> > > > > > > > > > > > > provide tests that are as thorough as we can make them. We are also
> > > > > > > > > > > > > happy to hear suggestions and supervision from the Flink community
> > > > > > > > > > > > > to continuously improve the stability and performance of the
> > > > > > > > > > > > > connector.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yijie
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Sep 5, 2019 at 5:59 AM Sijie Guo <
> > > > > guosi...@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks everyone for the comments and feedback.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It seems to me that the main question here is
> > about -
> > > > > "how
> > > > > > can
> > > > > > > > > the
> > > > > > > > > > > > Flink
> > > > > > > > > > > > > > community maintain the connector?".
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Here are two thoughts from myself.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) I think how and where to host this integration
> > is
> > > > kind
> > > > > > of
> > > > > > > > less
> > > > > > > > > > > > > important
> > > > > > > > > > > > > > here. I believe there can be many ways to achieve
> > it.
> > > > > > > > > > > > > > As part of the contribution, what we are looking
> > for
> > > > here
> > > > > > is
> > > > > > > > how
> > > > > > > > > > > these
> > > > > > > > > > > > > two
> > > > > > > > > > > > > > communities can build the collaboration
> > relationship
> > > on
> > > > > > > > > developing
> > > > > > > > > > > > > > the integration between Pulsar and Flink. Even we
> > can
> > > > try
> > > > > > our
> > > > > > > > > best
> > > > > > > > > > to
> > > > > > > > > > > > > catch
> > > > > > > > > > > > > > up all the updates in Flink community. We are still
> > > > > > > > > > > > > > facing the fact that we have less experiences in
> > > Flink
> > > > > than
> > > > > > > > folks
> > > > > > > > > > in
> > > > > > > > > > > > > Flink
> > > > > > > > > > > > > > community. In order to make sure we maintain and
> > > > deliver
> > > > > > > > > > > > > > a high-quality pulsar-flink integration to the
> > users
> > > > who
> > > > > > use
> > > > > > > > both
> > > > > > > > > > > > > > technologies, we need some help from the experts
> > from
> > > > > Flink
> > > > > > > > > > > community.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) We have been following FLIP-27 for a while. Originally we were
> > > > > > > > > > > > > > thinking of contributing the connectors back after integrating
> > > > > > > > > > > > > > with the new API introduced in FLIP-27. But we decided to initiate
> > > > > > > > > > > > > > the conversation as early as possible, because we believe there
> > > > > > > > > > > > > > are more benefits to doing it now rather than later. As part of
> > > > > > > > > > > > > > the contribution, it can help the Flink community understand more
> > > > > > > > > > > > > > about Pulsar and the potential integration points. We can also
> > > > > > > > > > > > > > help the Flink community verify the new connector API as well as
> > > > > > > > > > > > > > other new APIs (e.g. the catalog API).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Sijie
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 5:24 AM Becket Qin <
> > > > > > > > becket....@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Yijie,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the interest in contributing the
> > Pulsar
> > > > > > connector.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In general, I think having a Pulsar connector with strong support
> > > > > > > > > > > > > > > is a valuable addition to Flink, so I am happy to shepherd this
> > > > > > > > > > > > > > > effort. Meanwhile, I would also like to provide some context and
> > > > > > > > > > > > > > > recent efforts on the Flink connectors ecosystem.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The current way Flink maintains its connector has
> > > hit
> > > > > the
> > > > > > > > > > > scalability
> > > > > > > > > > > > > bar.
> > > > > > > > > > > > > > > With more and more connectors coming into Flink
> > > repo,
> > > > > we
> > > > > > are
> > > > > > > > > > > facing a
> > > > > > > > > > > > > few
> > > > > > > > > > > > > > > problems such as long build and testing time. To
> > > > > address
> > > > > > this
> > > > > > > > > > > > problem,
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > have attempted to do the following:
> > > > > > > > > > > > > > > 1. Split out the connectors into a separate
> > > > repository.
> > > > > > This
> > > > > > > > is
> > > > > > > > > > > > > temporarily
> > > > > > > > > > > > > > > on hold due to potential solution to shorten the
> > > > build
> > > > > > time.
> > > > > > > > > > > > > > > 2. Encourage the connectors to stay as ecosystem
> > > > > project
> > > > > > > > while
> > > > > > > > > > > Flink
> > > > > > > > > > > > > tries
> > > > > > > > > > > > > > > to provide good support for functionality and
> > > > > > compatibility
> > > > > > > > > > tests.
> > > > > > > > > > > > > Robert
> > > > > > > > > > > > > > > has driven to create a Flink Ecosystem project
> > > > website
> > > > > > and it
> > > > > > > > > is
> > > > > > > > > > > > going
> > > > > > > > > > > > > > > through some final approval process.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Given the above efforts, it would be great to
> > first
> > > > see
> > > > > > if we
> > > > > > > > > can
> > > > > > > > > > > > have
> > > > > > > > > > > > > > > Pulsar connector as an ecosystem project with
> > great
> > > > > > support.
> > > > > > > > It
> > > > > > > > > > > would
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > good to hear how the Flink Pulsar connector is
> > > tested
> > > > > > > > currently
> > > > > > > > > > to
> > > > > > > > > > > > see
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > we can learn something to maintain it as an
> > > ecosystem
> > > > > > project
> > > > > > > > > > with
> > > > > > > > > > > > good
> > > > > > > > > > > > > > > quality and test coverage. If the quality as an
> > > > > ecosystem
> > > > > > > > > project
> > > > > > > > > > > is
> > > > > > > > > > > > > hard
> > > > > > > > > > > > > > > to guarantee, we may as well adopt it into the
> > main
> > > > > repo.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > BTW, another ongoing effort is FLIP-27 where we
> > are
> > > > > > making
> > > > > > > > > > changes
> > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > Flink source connector architecture and
> > interface.
> > > > This
> > > > > > > > change
> > > > > > > > > > will
> > > > > > > > > > > > > likely
> > > > > > > > > > > > > > > land in 1.10. Therefore timing wise, if we are
> > > going
> > > > to
> > > > > > have
> > > > > > > > > the
> > > > > > > > > > > > Pulsar
> > > > > > > > > > > > > > > connector in main repo, I am wondering if we
> > should
> > > > > hold
> > > > > > a
> > > > > > > > > little
> > > > > > > > > > > bit
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > let the Pulsar connector adapt to the new
> > interface
> > > > to
> > > > > > avoid
> > > > > > > > > > > shortly
> > > > > > > > > > > > > > > deprecated work?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 4:32 PM Chesnay Schepler <
> > > > > > > > > > > ches...@apache.org>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm quite worried that we may end up repeating
> > > > > history.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > There were already 2 attempts at contributing a
> > > > > pulsar
> > > > > > > > > > connector,
> > > > > > > > > > > > > both
> > > > > > > > > > > > > > > > of which failed because no committer was
> > getting
> > > > > > involved,
> > > > > > > > > > > despite
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > contributor opening a dedicated discussion
> > thread
> > > > > > about the
> > > > > > > > > > > > > contribution
> > > > > > > > > > > > > > > > beforehand and getting several +1's from
> > > > committers.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > We should really make sure that if we
> > > > welcome/approve
> > > > > > such
> > > > > > > > a
> > > > > > > > > > > > > > > > contribution it will actually get the attention
> > > it
> > > > > > > > deserves.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As such, I'm inclined to recommend maintaining
> > > the
> > > > > > > > connector
> > > > > > > > > > > > outside
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > Flink. We could link to it from the
> > documentation
> > > > to
> > > > > > give
> > > > > > > > it
> > > > > > > > > > more
> > > > > > > > > > > > > > > exposure.
> > > > > > > > > > > > > > > > With the upcoming page for sharing artifacts
> > > among
> > > > > the
> > > > > > > > > > community
> > > > > > > > > > > > > (what's
> > > > > > > > > > > > > > > > the state of that anyway?), this may be a
> > better
> > > > > > option.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On 04/09/2019 10:16, Till Rohrmann wrote:
> > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > thanks a lot for starting this discussion
> > > Yijie.
> > > > I
> > > > > > think
> > > > > > > > > the
> > > > > > > > > > > > Pulsar
> > > > > > > > > > > > > > > > > connector would be a very valuable addition
> > > since
> > > > > > Pulsar
> > > > > > > > > > > becomes
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > more popular and it would further expand
> > > Flink's
> > > > > > > > > > > > interoperability.
> > > > > > > > > > > > > Also
> > > > > > > > > > > > > > > > > from a project perspective it makes sense for
> > > me
> > > > to
> > > > > > place
> > > > > > > > > the
> > > > > > > > > > > > > connector
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > the downstream project.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > My main concern/question is how can the Flink
> > > > > > community
> > > > > > > > > > > maintain
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > connector? We have seen in the past that
> > > > connectors
> > > > > > are
> > > > > > > > > some
> > > > > > > > > > of
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > most
> > > > > > > > > > > > > > > > > actively developed components because they
> > need
> > > > to
> > > > > be
> > > > > > > > kept
> > > > > > > > > in
> > > > > > > > > > > > sync
> > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > external system and with Flink. Given that
> > the
> > > > > Pulsar
> > > > > > > > > > community
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > willing
> > > > > > > > > > > > > > > > > to help with maintaining, improving and
> > > evolving
> > > > > the
> > > > > > > > > > connector,
> > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > optimistic that we can achieve this. Hence,
> > +1
> > > > for
> > > > > > > > > > contributing
> > > > > > > > > > > > it
> > > > > > > > > > > > > back
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > Flink.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 2:03 AM Sijie Guo <
> > > > > > > > > guosi...@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >> Hi Yun,
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> Since I was the main driver behind FLINK-9641 and FLINK-9168, let me
> > > > > > > > > > > > > > > > >> try to add more context on this.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> FLINK-9641 and FLINK-9168 were created to bring Pulsar in as a source
> > > > > > > > > > > > > > > > >> and sink for Flink. The integration was done with Flink 1.6.0. We
> > > > > > > > > > > > > > > > >> sent out pull requests about a year ago, and we ended up maintaining
> > > > > > > > > > > > > > > > >> those connectors in Pulsar so that Pulsar users can use Flink to
> > > > > > > > > > > > > > > > >> process event streams in Pulsar (see
> > > > > > > > > > > > > > > > >> https://github.com/apache/pulsar/tree/master/pulsar-flink). The Flink
> > > > > > > > > > > > > > > > >> 1.6 integration is pretty simple and has no schema considerations.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> In the past year, we have made a lot of changes in Pulsar and made
> > > > > > > > > > > > > > > > >> Pulsar schema a first-class citizen in Pulsar. We also integrated
> > > > > > > > > > > > > > > > >> with other computing engines for processing Pulsar event streams with
> > > > > > > > > > > > > > > > >> Pulsar schema.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> It led us to rethink how to integrate with
> > > > > > > > > > > > > > > > > >> Flink in the best way. We then reimplemented
> > > > > > > > > > > > > > > > > >> the pulsar-flink connectors from the ground up
> > > > > > > > > > > > > > > > > >> with schema and made the Table API and Catalog
> > > > > > > > > > > > > > > > > >> API first-class citizens in the integration.
> > > > > > > > > > > > > > > > > >> As a result, in the new pulsar-flink
> > > > > > > > > > > > > > > > > >> implementation, you can register Pulsar as a
> > > > > > > > > > > > > > > > > >> Flink catalog and query / process the event
> > > > > > > > > > > > > > > > > >> streams using Flink SQL.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> This is an example of how to use Pulsar as a
> > > > > > > > > > > > > > > > > >> Flink catalog:
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> https://github.com/streamnative/pulsar-flink/blob/3eeddec5625fc7dddc3f8a3ec69f72e1614ca9c9/README.md#use-pulsar-catalog
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> Yijie has also written a blog post explaining
> > > > > > > > > > > > > > > > > >> why we re-implemented the Flink connector with
> > > > > > > > > > > > > > > > > >> Flink 1.9 and what changes we made in the new
> > > > > > > > > > > > > > > > > >> connector:
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> https://medium.com/streamnative/use-apache-pulsar-as-streaming-table-with-8-lines-of-code-39033a93947f
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> We believe Pulsar is not just a simple data
> > > > > > > > > > > > > > > > > >> sink or source for Flink. It actually can be a
> > > > > > > > > > > > > > > > > >> fully integrated streaming data storage for
> > > > > > > > > > > > > > > > > >> Flink in many areas (sink, source,
> > > > > > > > > > > > > > > > > >> schema/catalog and state). The combination of
> > > > > > > > > > > > > > > > > >> Flink and Pulsar can create a great streaming
> > > > > > > > > > > > > > > > > >> warehouse architecture for streaming-first,
> > > > > > > > > > > > > > > > > >> unified data processing. Since we are talking
> > > > > > > > > > > > > > > > > >> about contributing the Pulsar integration to
> > > > > > > > > > > > > > > > > >> Flink here, we are also dedicated to
> > > > > > > > > > > > > > > > > >> maintaining, improving and evolving the
> > > > > > > > > > > > > > > > > >> integration with Flink to help the users who
> > > > > > > > > > > > > > > > > >> use both Flink and Pulsar.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> Hope this gives you a bit more background about
> > > > > > > > > > > > > > > > > >> the pulsar flink integration. Let me know what
> > > > > > > > > > > > > > > > > >> your thoughts are.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > > > > >> Sijie
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> On Tue, Sep 3, 2019 at 11:54 AM Yun Tang <myas...@live.com> wrote:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>> Hi Yijie
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> I can see that Pulsar has become more and more
> > > > > > > > > > > > > > > > > >>> popular recently, and I am very glad to see
> > > > > > > > > > > > > > > > > >>> more people willing to contribute to the Flink
> > > > > > > > > > > > > > > > > >>> ecosystem.
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> Before any further discussion, would you please
> > > > > > > > > > > > > > > > > >>> explain the relationship between this thread
> > > > > > > > > > > > > > > > > >>> and the existing JIRAs for the Pulsar source
> > > > > > > > > > > > > > > > > >>> [1] and sink [2] connectors? Will the
> > > > > > > > > > > > > > > > > >>> contribution contain part of those PRs, or is
> > > > > > > > > > > > > > > > > >>> it a totally different implementation?
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> [1] https://issues.apache.org/jira/browse/FLINK-9641
> > > > > > > > > > > > > > > > > >>> [2] https://issues.apache.org/jira/browse/FLINK-9168
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> Best
> > > > > > > > > > > > > > > > >>> Yun Tang
> > > > > > > > > > > > > > > > >>> ________________________________
> > > > > > > > > > > > > > > > > >>> From: Yijie Shen <henry.yijies...@gmail.com>
> > > > > > > > > > > > > > > > > >>> Sent: Tuesday, September 3, 2019 13:57
> > > > > > > > > > > > > > > > > >>> To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > > > > > > > > > > > > >>> Subject: [DISCUSS] Contribute Pulsar Flink connector back to Flink
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> Dear Flink Community!
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> I would like to open the discussion of
> > > > > > > > > > > > > > > > > >>> contributing Pulsar Flink connector [0] back to
> > > > > > > > > > > > > > > > > >>> Flink.
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> ## A brief introduction to Apache Pulsar
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> Apache Pulsar[1] is a multi-tenant,
> > > > > > > > > > > > > > > > > >>> high-performance distributed pub-sub messaging
> > > > > > > > > > > > > > > > > >>> system. Pulsar includes multiple features such
> > > > > > > > > > > > > > > > > >>> as native support for multiple clusters in a
> > > > > > > > > > > > > > > > > >>> Pulsar instance, with seamless geo-replication
> > > > > > > > > > > > > > > > > >>> of messages across clusters, very low publish
> > > > > > > > > > > > > > > > > >>> and end-to-end latency, seamless scalability to
> > > > > > > > > > > > > > > > > >>> over a million topics, and guaranteed message
> > > > > > > > > > > > > > > > > >>> delivery with persistent message storage
> > > > > > > > > > > > > > > > > >>> provided by Apache BookKeeper. Nowadays, Pulsar
> > > > > > > > > > > > > > > > > >>> has been adopted by more and more companies[2].
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> ## The status of Pulsar Flink connector
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> The Pulsar Flink connector we are planning to
> > > > > > > > > > > > > > > > > >>> contribute is built upon Flink 1.9.0 and Pulsar
> > > > > > > > > > > > > > > > > >>> 2.4.0. The main features are:
> > > > > > > > > > > > > > > > > >>> - Pulsar as a streaming source with
> > > > > > > > > > > > > > > > > >>> exactly-once guarantee.
> > > > > > > > > > > > > > > > > >>> - Sink streaming results to Pulsar with
> > > > > > > > > > > > > > > > > >>> at-least-once semantics. (We would update this
> > > > > > > > > > > > > > > > > >>> to exactly-once as well when Pulsar gets all
> > > > > > > > > > > > > > > > > >>> transaction features ready in its 2.5.0
> > > > > > > > > > > > > > > > > >>> version)
> > > > > > > > > > > > > > > > > >>> - Build upon Flink new Table API Type system
> > > > > > > > > > > > > > > > > >>> (FLIP-37[3]), and can automatically
> > > > > > > > > > > > > > > > > >>> (de)serialize messages with the help of Pulsar
> > > > > > > > > > > > > > > > > >>> schema.
> > > > > > > > > > > > > > > > > >>> - Integrate with Flink new Catalog API
> > > > > > > > > > > > > > > > > >>> (FLIP-30[4]), which enables the use of Pulsar
> > > > > > > > > > > > > > > > > >>> topics as tables in Table API as well as SQL
> > > > > > > > > > > > > > > > > >>> client.
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> ## Reference
> > > > > > > > > > > > > > > > > >>> [0] https://github.com/streamnative/pulsar-flink
> > > > > > > > > > > > > > > > > >>> [1] https://pulsar.apache.org/
> > > > > > > > > > > > > > > > > >>> [2] https://pulsar.apache.org/en/powered-by/
> > > > > > > > > > > > > > > > > >>> [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System
> > > > > > > > > > > > > > > > > >>> [4] https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> Best,
> > > > > > > > > > > > > > > > >>> Yijie Shen
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
