Thanks Stephan! This is very exciting news for the Flink community. I recommend creating a branch for Blink in the Flink repository. Like a long-running feature branch, the Blink branch would carry the many enhancements, and the enhanced functionality could be continuously merged into the Flink master.
Cheers,
Jincheng

Timo Walther <twal...@apache.org> wrote on Tue, Jan 22, 2019 at 4:45 PM:

> Thanks for driving these efforts, Stephan! Great news that the Blink
> code base will be available for everyone soon. I already got access to
> it, and the added functionality and improved architecture are impressive.
> These will be nice additions to Flink.
>
> I guess the Blink code base will be continuously updated while the Flink
> community merges chunks of it, right? If yes, I would also be in favor
> of a separate repository similar to flink-shaded.
>
> Regards,
> Timo
>
>
> On 22.01.19 at 09:20, Kurt Young wrote:
> > Hi Driesprong,
> >
> > Glad to hear that you're interested in Blink's code. Actually, Blink
> > only has one branch by itself, so either a separate repo or a Flink
> > branch works for sharing Blink's code.
> >
> > Best,
> > Kurt
> >
> >
> > On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <fo...@driesprong.frl> wrote:
> >
> >> Great news, Stephan!
> >>
> >> Why not make the code available by having a fork of Flink on Alibaba's
> >> GitHub account? This would allow us to do easy diffs in the GitHub UI
> >> and create PRs of cherry-picked commits if needed. I can imagine that
> >> the Blink codebase has a lot of branches by itself, so just pushing a
> >> couple of branches to the main Flink repo is not ideal. Looking forward
> >> to it!
> >>
> >> Cheers, Fokko
> >>
> >>
> >> On Tue, Jan 22, 2019 at 03:48, Shaoxuan Wang <wshaox...@gmail.com> wrote:
> >>
> >>> A big +1 to contributing the Blink codebase directly into the Apache
> >>> Flink project.
> >>> Looking forward to the new journey.
> >>>
> >>> Regards,
> >>> Shaoxuan
> >>>
> >>> On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com> wrote:
> >>>
> >>>> Thanks Stephan! We are hoping to make the process as non-disruptive
> >>>> as possible for the Flink community. Making the Blink codebase public
> >>>> is the first step that hopefully facilitates further discussions.
> >>>> Xiaowei
> >>>>
> >>>> On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <se...@apache.org> wrote:
> >>>>
> >>>> Dear Flink Community!
> >>>>
> >>>> Some of you may have heard it already from announcements or from a
> >>>> Flink Forward talk: Alibaba has decided to open source its in-house
> >>>> improvements to Flink, called Blink!
> >>>> First of all, a big thanks to the team that developed these
> >>>> improvements and made this contribution possible!
> >>>>
> >>>> Blink has some very exciting enhancements, most prominently on the
> >>>> Table API/SQL side and the unified execution of these programs. For
> >>>> batch (bounded) data, the SQL execution has full TPC-DS coverage
> >>>> (which is a big deal), and the execution is more than 10x faster than
> >>>> the current SQL runtime in Flink. Blink has also added support for
> >>>> catalogs, improved the failover speed of batch queries, and improved
> >>>> resource management. It also takes some good steps toward more deeply
> >>>> unifying batch and streaming execution.
> >>>>
> >>>> The proposal is to merge Blink's enhancements into Flink, to give
> >>>> Flink's SQL/Table API and execution a big boost in usability and
> >>>> performance.
> >>>>
> >>>> Just to avoid any confusion: this is not a suggested change of focus
> >>>> to batch processing, nor would it break with any of Flink's streaming
> >>>> architecture and vision. This contribution very much follows the
> >>>> principle of "batch is a special case of streaming". As a special
> >>>> case, batch makes special optimizations possible. In its current
> >>>> state, Flink does not exploit many of these optimizations. This
> >>>> contribution adds exactly these optimizations and makes the streaming
> >>>> model of Flink applicable to harder batch use cases.
> >>>>
> >>>> Assuming that the community is excited about this as well, and in
> >>>> favor of these enhancements to Flink's capabilities, below are some
> >>>> thoughts on how this contribution and integration could work.
> >>>>
> >>>> --- Making the code available ---
> >>>>
> >>>> At the moment, the Blink code is in the form of a big Flink fork
> >>>> (rather than isolated patches on top of Flink), so the integration is
> >>>> unfortunately not as easy as merging a few patches or pull requests.
> >>>>
> >>>> To support a non-disruptive merge of such a big contribution, I
> >>>> believe it makes sense to make the code of the fork available in the
> >>>> Flink project first. From there on, we can start to work on the
> >>>> details of merging the enhancements, including the refactoring of the
> >>>> necessary parts in the Flink master and the Blink code to make a
> >>>> merge possible without repeatedly breaking compatibility.
> >>>>
> >>>> The first question is where to put the code of the Blink fork during
> >>>> the merge. My first thought was to temporarily add a repository (like
> >>>> "flink-blink-staging"), but we could also put it into a special
> >>>> branch in the main Flink repository.
> >>>>
> >>>> I will start a separate thread to discuss a possible strategy for
> >>>> handling and merging such a big contribution.
> >>>>
> >>>> Best,
> >>>> Stephan
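[Editor's note: the "special branch" option Stephan describes is plain git mechanics. The sketch below is purely illustrative and uses local throwaway repositories as stand-ins for the real ones; all repository names and paths are hypothetical, not actual Flink infrastructure.]

```shell
# Illustrative only: simulate, with local repos, how a fork's history can be
# published as a dedicated branch of the main repository.
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Stand-in for the main Flink repo, with one commit on master.
git init -q -b master flink
git -C flink -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "flink: master history"

# Stand-in for the Blink fork: cloned from the main repo, then diverged.
git clone -q flink blink-fork
git -C blink-fork -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "blink: enhancements"

# Publish the fork's work as a special "blink" branch in the main repo.
# (Pushing works because "blink" is not the branch checked out upstream.)
git -C blink-fork push -q origin HEAD:refs/heads/blink

git -C flink branch --list blink   # the fork's history is now a branch here
```

From such a branch, individual enhancements could then be cherry-picked or refactored into master incrementally, which matches the merge strategy discussed in the thread.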