Really excited to see Blink joining the Flink community!

My two cents regarding repo vs. branch: I am +1 for a branch in Flink.
Among many things, what's most important at this point is probably to make
the Blink code available to developers so people can discuss the merge
strategy. Creating a branch is probably one of the fastest ways to do
that. We can always create a separate repo later if necessary.

WRT the doc and jar distribution, it is true that we are going to have some
major refactoring of the code. But I can imagine some curious users may
still want to try out something in Blink, and it would be good if we could do
them a favor. Legally, my hunch is that it is probably OK for someone to
just build the jars and docs and host them somewhere for convenience. But it
should be clear that this is just for convenience and not an
official release from Apache (unless we would like to make it official).

Thanks,

Jiangjie (Becket) Qin

On Wed, Jan 23, 2019 at 6:48 PM Chesnay Schepler <ches...@apache.org> wrote:

> From the ASF side, JAR files do not require a vote/release process; this
> is at the discretion of the PMC.
>
> However, I have my doubts whether at this time we could even create a
> source release of Blink given that we'd have to vet the code-base first.
>
> Even without source release we could still distribute jars, but would
> not be allowed to advertise them to users as they do not constitute an
> official release.
>
> On 23.01.2019 11:41, Timo Walther wrote:
> > As far as I know, we will not provide any binaries but only the
> > source code. JAR files on Apache servers would need an official
> > voting/release process. Interested users can build Blink themselves
> > using `mvn clean package`.
> >
> > @Stephan: Please correct me if I'm wrong.
> >
> > Regards,
> > Timo
> >
> > On 23.01.19 at 11:16, Kurt Young wrote:
> >> Hi Timo,
> >>
> >> What about the jar files, will Blink's jars be uploaded to the Apache
> >> repository? If not, I think it will be very inconvenient for users who
> >> want to try Blink and view the docs when they need help.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Wed, Jan 23, 2019 at 6:09 PM Timo Walther <twal...@apache.org> wrote:
> >>
> >>> Hi Kurt,
> >>>
> >>> I would not make Blink's documentation visible to users or search
> >>> engines via a website. Otherwise this would communicate that Blink
> >>> is an official release. I would suggest putting the Blink docs into
> >>> `/docs`, and people can build them with `./docs/build.sh -pi` if they
> >>> are interested. I would not invest time into setting up a docs
> >>> infrastructure.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>> On 23.01.19 at 08:56, Kurt Young wrote:
> >>>> Thanks @Stephan for this exciting announcement!
> >>>>
> >>>> From my point of view, I would prefer to use a branch. It makes the
> >>>> message "Blink is part of Flink" more straightforward and clear.
> >>>>
> >>>> Besides the location of the Blink code, there are some other questions,
> >>>> like what version we should use and where we put Blink's documentation.
> >>>> Currently, we use "1.5.1-blink-r0" as Blink's version, since Blink
> >>>> forked from Flink 1.5.1. We also added some docs to Blink, just as
> >>>> Flink did. Can Blink use a website like
> >>>> "https://ci.apache.org/projects/flink/flink-docs-release-1.7/" to host
> >>>> all of Blink's docs, changed to something like
> >>>> https://ci.apache.org/projects/flink/flink-docs-blink-r0/ ?
> >>>>
> >>>> Best,
> >>>> Kurt
> >>>>
> >>>>
> >>>> On Wed, Jan 23, 2019 at 10:55 AM Hequn Cheng <chenghe...@gmail.com> wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> @Stephan Thanks a lot for driving these efforts. I think a lot of
> >>>>> people are already waiting for this.
> >>>>> +1 for opening the Blink source code.
> >>>>> Either a separate repository or a special branch is OK for me.
> >>>>> Hopefully, this will not last too long.
> >>>>>
> >>>>> Best, Hequn
> >>>>>
> >>>>>
> >>>>> On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <imj...@gmail.com> wrote:
> >>>>>
> >>>>>> Great news! Looking forward to the new wave of developments.
> >>>>>>
> >>>>>> If Blink needs to be continuously updated, with bug fixes and new
> >>>>>> releases, maybe a separate repository is a better idea.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>> On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <wos...@gmail.com> wrote:
> >>>>>>> Hey!
> >>>>>>> I also think that creating a separate branch for Blink in the
> >>>>>>> Flink repo is a better idea than creating a fork, as IMHO it will
> >>>>>>> allow merging changes more easily.
> >>>>>>>
> >>>>>>> Best Regards,
> >>>>>>> Dom.
> >>>>>>>
> >>>>>>> On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <u...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Hey Stephan and others,
> >>>>>>>>
> >>>>>>>> thanks for the summary. I'm very excited about the outlined
> >>>>>>>> improvements. :-)
> >>>>>>>>
> >>>>>>>> Separate branch vs. fork: I'm fine with either of the suggestions.
> >>>>>>>> Depending on the expected strategy for merging the changes, the
> >>>>>>>> expected number of additional changes, etc., either one or the
> >>>>>>>> other approach might be better suited.
> >>>>>>>>
> >>>>>>>> – Ufuk
> >>>>>>>>
> >>>>>>>> On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <ykt...@gmail.com> wrote:
> >>>>>>>>> Hi Driesprong,
> >>>>>>>>>
> >>>>>>>>> Glad to hear that you're interested in Blink's code. Actually,
> >>>>>>>>> Blink only has one branch, so either a separate repo or a branch
> >>>>>>>>> in Flink works for sharing Blink's code.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Kurt
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <fo...@driesprong.frl> wrote:
> >>>>>>>>>
> >>>>>>>>>> Great news Stephan!
> >>>>>>>>>>
> >>>>>>>>>> Why not make the code available by having a fork of Flink on
> >>>>>>>>>> Alibaba's GitHub account? This will allow us to do easy diffs in
> >>>>>>>>>> the GitHub UI and create PRs of cherry-picked commits if needed. I
> >>>>>>>>>> can imagine that the Blink codebase has a lot of branches by itself,
> >>>>>>>>>> so just pushing a couple of branches to the main Flink repo is not
> >>>>>>>>>> ideal. Looking forward to it!
> >>>>>>>>>>
> >>>>>>>>>> Cheers, Fokko
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <wshaox...@gmail.com> wrote:
> >>>>>>>>>>> Big +1 to contributing the Blink codebase directly into the Apache
> >>>>>>>>>>> Flink project.
> >>>>>>>>>>> Looking forward to the new journey.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Shaoxuan
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com> wrote:
> >>>>>>>>>>>> Thanks Stephan! We are hoping to make the process as
> >>>>>>>>>>>> non-disruptive as possible to the Flink community. Making the Blink
> >>>>>>>>>>>> codebase public is the first step that hopefully facilitates further
> >>>>>>>>>>>> discussions.
> >>>>>>>>>>>> Xiaowei
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <se...@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>    Dear Flink Community!
> >>>>>>>>>>>>
> >>>>>>>>>>>> Some of you may have heard it already from announcements or from a
> >>>>>>>>>>>> Flink Forward talk:
> >>>>>>>>>>>> Alibaba has decided to open source its in-house improvements to
> >>>>>>>>>>>> Flink, called Blink!
> >>>>>>>>>>>> First of all, big thanks to the team that developed these
> >>>>>>>>>>>> improvements and made this contribution possible!
> >>>>>>>>>>>>
> >>>>>>>>>>>> Blink has some very exciting enhancements, most prominently on the
> >>>>>>>>>>>> Table API/SQL side and the unified execution of these programs. For
> >>>>>>>>>>>> batch (bounded) data, the SQL execution has full TPC-DS coverage
> >>>>>>>>>>>> (which is a big deal), and the execution is more than 10x faster
> >>>>>>>>>>>> than the current SQL runtime in Flink. Blink has also added support
> >>>>>>>>>>>> for catalogs, and improved the failover speed of batch queries and
> >>>>>>>>>>>> the resource management. It also makes some good steps in the
> >>>>>>>>>>>> direction of more deeply unifying the batch and streaming execution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The proposal is to merge Blink's enhancements into Flink, to give
> >>>>>>>>>>>> Flink's SQL/Table API and execution a big boost in usability and
> >>>>>>>>>>>> performance.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Just to avoid any confusion: This is not a suggested change of focus
> >>>>>>>>>>>> to batch processing, nor would this break with any of the streaming
> >>>>>>>>>>>> architecture and vision of Flink.
> >>>>>>>>>>>> This contribution follows very much the principle of "batch is a
> >>>>>>>>>>>> special case of streaming".
> >>>>>>>>>>>> As a special case, batch makes special optimizations possible. In
> >>>>>>>>>>>> its current state, Flink does not exploit many of these
> >>>>>>>>>>>> optimizations. This contribution adds exactly these optimizations
> >>>>>>>>>>>> and makes the streaming model of Flink applicable to harder batch
> >>>>>>>>>>>> use cases.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Assuming that the community is excited about this as well, and in
> >>>>>>>>>>>> favor of these enhancements to Flink's capabilities, below are some
> >>>>>>>>>>>> thoughts on how this contribution and integration could work.
> >>>>>>>>>>>>
> >>>>>>>>>>>> --- Making the code available ---
> >>>>>>>>>>>>
> >>>>>>>>>>>> At the moment, the Blink code is in the form of a big Flink fork
> >>>>>>>>>>>> (rather than isolated patches on top of Flink), so the integration
> >>>>>>>>>>>> is unfortunately not as easy as merging a few patches or pull
> >>>>>>>>>>>> requests.
> >>>>>>>>>>>>
> >>>>>>>>>>>> To support a non-disruptive merge of such a big contribution, I
> >>>>>>>>>>>> believe it makes sense to make the code of the fork available in the
> >>>>>>>>>>>> Flink project first.
> >>>>>>>>>>>> From there on, we can start to work on the details for merging the
> >>>>>>>>>>>> enhancements, including the refactoring of the necessary parts in
> >>>>>>>>>>>> the Flink master and the Blink code to make a merge possible without
> >>>>>>>>>>>> repeatedly breaking compatibility.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The first question is where do we put the code of the Blink fork
> >>>>>>>>>>>> during the merging procedure?
> >>>>>>>>>>>> My first thought was to temporarily add a repository (like
> >>>>>>>>>>>> "flink-blink-staging"), but we could also put it into a special
> >>>>>>>>>>>> branch in the main Flink repository.
> >>>>>>>>>>>> I will start a separate thread about discussing a possible strategy
> >>>>>>>>>>>> to handle and merge such a big contribution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Stephan
> >>>>>>>>>>>>
> >>>
> >
> >
>
>
