Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-12 Thread Zhijiang
+1 (binding)
Best,
Zhijiang
--
From:Kurt Young 
Send Time:Friday, January 12, 2024 15:31
To:dev
Subject:Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink
+1 (binding)
Best,
Kurt
On Fri, Jan 12, 2024 at 2:21 PM Hequn Cheng  wrote:
> +1 (binding)
>
> Thanks,
> Hequn
>
> On Fri, Jan 12, 2024 at 2:19 PM godfrey he  wrote:
>
> > +1 (binding)
> >
> > Thanks,
> > Godfrey
> >
> > Zhu Zhu  于2024年1月12日周五 14:10写道:
> > >
> > > +1 (binding)
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Hangxiang Yu  于2024年1月11日周四 14:26写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Thu, Jan 11, 2024 at 11:19 AM Xuannan Su 
> > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Best,
> > > > > Xuannan
> > > > >
> > > > > On Thu, Jan 11, 2024 at 10:28 AM Xuyang 
> wrote:
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Best!
> > > > > > Xuyang
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > 在 2024-01-11 10:00:11,"Yang Wang"  写道:
> > > > > > >+1 (binding)
> > > > > > >
> > > > > > >
> > > > > > >Best,
> > > > > > >Yang
> > > > > > >
> > > > > > >On Thu, Jan 11, 2024 at 9:53 AM liu ron 
> > wrote:
> > > > > > >
> > > > > > >> +1 non-binding
> > > > > > >>
> > > > > > >> Best
> > > > > > >> Ron
> > > > > > >>
> > > > > > >> Matthias Pohl  于2024年1月10日周三
> > > > 23:05写道:
> > > > > > >>
> > > > > > >> > +1 (binding)
> > > > > > >> >
> > > > > > >> > On Wed, Jan 10, 2024 at 3:35 PM ConradJam <
> > jam.gz...@gmail.com>
> > > > > wrote:
> > > > > > >> >
> > > > > > >> > > +1 non-binding
> > > > > > >> > >
> > > > > > >> > > Dawid Wysakowicz  于2024年1月10日周三
> > > > 21:06写道:
> > > > > > >> > >
> > > > > > >> > > > +1 (binding)
> > > > > > >> > > > Best,
> > > > > > >> > > > Dawid
> > > > > > >> > > >
> > > > > > >> > > > On Wed, 10 Jan 2024 at 11:54, Piotr Nowojski <
> > > > > pnowoj...@apache.org>
> > > > > > >> > > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > +1 (binding)
> > > > > > >> > > > >
> > > > > > >> > > > > śr., 10 sty 2024 o 11:25 Martijn Visser <
> > > > > martijnvis...@apache.org>
> > > > > > >> > > > > napisał(a):
> > > > > > >> > > > >
> > > > > > >> > > > > > +1 (binding)
> > > > > > >> > > > > >
> > > > > > >> > > > > > On Wed, Jan 10, 2024 at 4:43 AM Xingbo Huang <
> > > > > hxbks...@gmail.com
> > > > > > >> >
> > > > > > >> > > > wrote:
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > +1 (binding)
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > Best,
> > > > > > >> > > > > > > Xingbo
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > Dian Fu  于2024年1月10日周三
> > 11:35写道:
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > > +1 (binding)
> > > > > > >> > > > > > > >

Re: [VOTE] FLIP-151: Incremental snapshots for heap-based state backend

2021-03-03 Thread Zhijiang
+1 (binding) from my side, wishing this feature well!

Best,
Zhijiang


--
From:Piotr Nowojski 
Send Time:Tuesday, March 2, 2021 00:11
To:dev ; roman 
Subject:Re: [VOTE] FLIP-151: Incremental snapshots for heap-based state backend

Thanks Roman for coming up with this proposal and driving this topic:

+1 (binding) from my side

Piotrek

On Mon, Mar 1, 2021 at 10:12, Roman Khachatryan  wrote:

> Hi everyone,
>
> since the discussion [1] about FLIP-151 [2] seems to have reached a
> consensus, I'd like to start a formal vote for the FLIP.
>
> Please vote +1 to approve the FLIP, or -1 with a comment. The vote will be
> open at least until Wednesday, Mar 3rd.
>
> [1] https://s.apache.org/flip-151-discussion
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-151%3A+Incremental+snapshots+for+heap-based+state+backend
>
> Regards,
> Roman
>



Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and Xingbo Huang

2021-02-22 Thread Zhijiang
Congratulations Wei and Xingbo!


Best,
Zhijiang


--
From:Yun Tang 
Send Time:Tuesday, February 23, 2021 10:58
To:Roman Khachatryan ; dev 
Subject:Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and Xingbo 
Huang

Congratulation!

Best
Yun Tang

From: Yun Gao 
Sent: Tuesday, February 23, 2021 10:56
To: Roman Khachatryan ; dev 
Subject: Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and Xingbo 
Huang

Congratulations Wei and Xingbo!

Best,
Yun


 --Original Mail --
Sender:Roman Khachatryan 
Send Date:Tue Feb 23 00:59:22 2021
Recipients:dev 
Subject:Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and Xingbo Huang
Congratulations!

Regards,
Roman


On Mon, Feb 22, 2021 at 12:22 PM Yangze Guo  wrote:

> Congrats,  Well deserved!
>
> Best,
> Yangze Guo
>
> On Mon, Feb 22, 2021 at 6:47 PM Yang Wang  wrote:
> >
> > Congratulations Wei & Xingbo!
> >
> > Best,
> > Yang
> >
> > Rui Li  于2021年2月22日周一 下午6:23写道:
> >
> > > Congrats Wei & Xingbo!
> > >
> > > On Mon, Feb 22, 2021 at 4:24 PM Yuan Mei 
> wrote:
> > >
> > > > Congratulations Wei & Xingbo!
> > > >
> > > > Best,
> > > > Yuan
> > > >
> > > > On Mon, Feb 22, 2021 at 4:04 PM Yu Li  wrote:
> > > >
> > > > > Congratulations Wei and Xingbo!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Mon, 22 Feb 2021 at 15:56, Till Rohrmann 
> > > > wrote:
> > > > >
> > > > > > Congratulations Wei & Xingbo. Great to have you as committers in
> the
> > > > > > community now.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Mon, Feb 22, 2021 at 5:08 AM Xintong Song <
> tonysong...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Congratulations, Wei & Xingbo~! Welcome aboard.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Feb 22, 2021 at 11:48 AM Dian Fu 
> > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > On behalf of the PMC, I’m very happy to announce that Wei
> Zhong
> > > and
> > > > > > > Xingbo
> > > > > > > > Huang have accepted the invitation to become Flink
> committers.
> > > > > > > >
> > > > > > > > - Wei Zhong mainly works on PyFlink and has driven several
> > > > important
> > > > > > > > features in PyFlink, e.g. Python UDF dependency management
> > > > (FLIP-78),
> > > > > > > > Python UDF support in SQL (FLIP-106, FLIP-114), Python UDAF
> > > support
> > > > > > > > (FLIP-139), etc. He has contributed the first PR of PyFlink
> and
> > > > have
> > > > > > > > contributed 100+ commits since then.
> > > > > > > >
> > > > > > > > - Xingbo Huang's contribution is also mainly in PyFlink and
> has
> > > > > driven
> > > > > > > > several important features in PyFlink, e.g. performance
> > > optimizing
> > > > > for
> > > > > > > > Python UDF and Python UDAF (FLIP-121, FLINK-16747,
> FLINK-19236),
> > > > > Pandas
> > > > > > > > UDAF support (FLIP-137), Python UDTF support (FLINK-14500),
> > > > row-based
> > > > > > > > Operations support in Python Table API (FLINK-20479), etc.
> He is
> > > > also
> > > > > > > > actively helping on answering questions in the user mailing
> list,
> > > > > > helping
> > > > > > > > on the release check, monitoring the status of the azure
> > > pipeline,
> > > > > etc.
> > > > > > > >
> > > > > > > > Please join me in congratulating Wei Zhong and Xingbo Huang
> for
> > > > > > becoming
> > > > > > > > Flink committers!
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Dian
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best regards!
> > > Rui Li
> > >
>



Re: [ANNOUNCE] Welcome Roman Khachatryan a new Apache Flink Committer

2021-02-18 Thread Zhijiang
Congrats, Roman! :)


Best,
Zhijiang


--
From:David Anderson 
Send Time:Thursday, February 18, 2021 23:33
To:dev 
Cc:ro...@apache.org 
Subject:Re: [ANNOUNCE] Welcome Roman Khachatryan a new Apache Flink Committer

Congratulations, Roman! Glad to have you onboard!!

David

On Thu, Feb 18, 2021 at 10:51 AM Congxian Qiu 
wrote:

> Congratulations, Roman
> Best,
> Congxian
>
>
> Leonard Xu  于2021年2月18日周四 下午1:47写道:
>
> > Congrats Roman!
> >
> > Best,
> > Leonard
> >
> > > 在 2021年2月18日,11:10,Yu Li  写道:
> > >
> > > Congratulations, Roman!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Thu, 18 Feb 2021 at 11:05, Xingbo Huang  wrote:
> > >
> > >> Congratulations Roman!
> > >>
> > >> Best,
> > >> Xingbo
> > >>
> > >> Yang Wang  于2021年2月18日周四 上午10:29写道:
> > >>
> > >>> Congrats Roman!
> > >>>
> > >>> Best,
> > >>> Yang
> > >>>
> > >>> Xintong Song  于2021年2月18日周四 上午10:00写道:
> > >>>
> > >>>> Congratulations, Roman~!
> > >>>>
> > >>>> Thank you~
> > >>>>
> > >>>> Xintong Song
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Thu, Feb 18, 2021 at 9:42 AM Dian Fu 
> > wrote:
> > >>>>
> > >>>>> Congratulations, Roman!
> > >>>>>
> > >>>>> Regards,
> > >>>>> Dian
> > >>>>>
> > >>>>>> 在 2021年2月16日,下午5:56,Yuan Mei  写道:
> > >>>>>>
> > >>>>>> Well deserved! Congrats Roman!
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Yuan
> > >>>>>>
> > >>>>>> On Tue, Feb 16, 2021 at 5:10 PM Guowei Ma 
> > >>>> wrote:
> > >>>>>>
> > >>>>>>> Congratulations Roman!
> > >>>>>>> Best,
> > >>>>>>> Guowei
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Thu, Feb 11, 2021 at 3:37 PM Yun Tang 
> > >> wrote:
> > >>>>>>>
> > >>>>>>>> Congratulations, Roman!
> > >>>>>>>>
> > >>>>>>>> Today is also the beginning of Chinese Spring Festival holiday,
> > >> at
> > >>>>> which
> > >>>>>>>> we Chinese celebrate across the world for the next lunar new
> > >> year,
> > >>>> and
> > >>>>>>> also
> > >>>>>>>> very happy to have you on board!
> > >>>>>>>>
> > >>>>>>>> Best
> > >>>>>>>> Yun Tang
> > >>>>>>>> 
> > >>>>>>>> From: Roman Khachatryan 
> > >>>>>>>> Sent: Thursday, February 11, 2021 4:03
> > >>>>>>>> To: matth...@ververica.com 
> > >>>>>>>> Cc: dev 
> > >>>>>>>> Subject: Re: [ANNOUNCE] Welcome Roman Khachatryan a new Apache
> > >>> Flink
> > >>>>>>>> Committer
> > >>>>>>>>
> > >>>>>>>> Many thanks to all of you!
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Roman
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Wed, Feb 10, 2021 at 7:12 PM Matthias Pohl <
> > >>>> matth...@ververica.com>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Congratulations, Roman! :-)
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Feb 10, 2021 at 3:23 PM Kezhu Wang 
> > >>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Congratulations!
> > >>>>>>>>>>

Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-21 Thread Zhijiang
Congrats, Guowei!


Best,
Zhijiang


--
From:Biao Liu 
Send Time:Thursday, January 21, 2021 14:45
To:dev 
Subject:Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

Congrats, Guowei!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 21 Jan 2021 at 09:30, Paul Lam  wrote:

> Congrats, Guowei!
>
> Best,
> Paul Lam
>
> > 2021年1月21日 07:21,Steven Wu  写道:
> >
> > Congrats, Guowei!
> >
> > On Wed, Jan 20, 2021 at 10:32 AM Seth Wiesman 
> wrote:
> >
> >> Congratulations!
> >>
> >> On Wed, Jan 20, 2021 at 3:41 AM hailongwang <18868816...@163.com>
> wrote:
> >>
> >>> Congratulations, Guowei!
> >>>
> >>> Best,
> >>> Hailong
> >>>
> >>> 在 2021-01-20 15:55:24,"Till Rohrmann"  写道:
> >>>> Congrats, Guowei!
> >>>>
> >>>> Cheers,
> >>>> Till
> >>>>
> >>>> On Wed, Jan 20, 2021 at 8:32 AM Matthias Pohl  >
> >>>> wrote:
> >>>>
> >>>>> Congrats, Guowei!
> >>>>>
> >>>>> On Wed, Jan 20, 2021 at 8:22 AM Congxian Qiu  >
> >>>>> wrote:
> >>>>>
> >>>>>> Congrats Guowei!
> >>>>>>
> >>>>>> Best,
> >>>>>> Congxian
> >>>>>>
> >>>>>>
> >>>>>> Danny Chan  于2021年1月20日周三 下午2:59写道:
> >>>>>>
> >>>>>>> Congratulations Guowei!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Danny
> >>>>>>>
> >>>>>>> Jark Wu  于2021年1月20日周三 下午2:47写道:
> >>>>>>>
> >>>>>>>> Congratulations Guowei!
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Jark
> >>>>>>>>
> >>>>>>>> On Wed, 20 Jan 2021 at 14:36, SHI Xiaogang <
> >>> shixiaoga...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Congratulations MA!
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Xiaogang
> >>>>>>>>>
> >>>>>>>>> Yun Tang  于2021年1月20日周三 下午2:24写道:
> >>>>>>>>>
> >>>>>>>>>> Congratulations Guowei!
> >>>>>>>>>>
> >>>>>>>>>> Best
> >>>>>>>>>> Yun Tang
> >>>>>>>>>> 
> >>>>>>>>>> From: Yang Wang 
> >>>>>>>>>> Sent: Wednesday, January 20, 2021 13:59
> >>>>>>>>>> To: dev 
> >>>>>>>>>> Subject: Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new
> >> Apache
> >>>>> Flink
> >>>>>>>>>> Committer
> >>>>>>>>>>
> >>>>>>>>>> Congratulations Guowei!
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Yang
> >>>>>>>>>>
> >>>>>>>>>> Yun Gao  于2021年1月20日周三
> >>> 下午1:52写道:
> >>>>>>>>>>
> >>>>>>>>>>> Congratulations Guowei!
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>>
> >>>>>>>>
> >>> Yun--
> >>>>>>>>>>> Sender:Yangze Guo
> >>>>>>>>>>> Date:2021/01/20 13:48:52
> >>>>>>>>>>> Recipient:dev
> >>>>>>>>>>> Theme:Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache
> >> Flink
> >>>>>>>> Committer
> >>>>>>>>>>>
> >>>>>>>>>>> Congratulations, Guowei! Well deserved.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jan 20, 2021 at 1:46 PM Xintong Song <
> >>>>>>> tonysong...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Congratulations, Guowei~!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you~
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xintong Song
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jan 20, 2021 at 1:42 PM Yuan Mei <
> >>>>>> yuanmei.w...@gmail.com
> >>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Congrats Guowei :-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Yuan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jan 20, 2021 at 1:36 PM tison <
> >>>>> wander4...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Congrats Guowei!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> tison.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Kurt Young  于2021年1月20日周三
> >> 下午1:34写道:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm very happy to announce that Guowei Ma has
> >>> accepted
> >>>>>> the
> >>>>>>>>>>> invitation
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> become a Flink committer.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Guowei is a very long term Flink developer, he has
> >>> been
> >>>>>>>>> extremely
> >>>>>>>>>>>>> helpful
> >>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>> some important runtime changes, and also been
> >>> active
> >>>>>> with
> >>>>>>>>>>> answering
> >>>>>>>>>>>>> user
> >>>>>>>>>>>>>>> questions as well as discussing designs.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Please join me in congratulating Guowei for
> >>> becoming a
> >>>>>>> Flink
> >>>>>>>>>>> committer!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >>
>
>



Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility

2020-12-10 Thread Zhijiang
Thanks for the further info and explanations! I have no other concerns.

Best,
Zhijiang


--
From:Rui Li 
Send Time:Thursday, December 10, 2020 20:35
To:dev ; Zhijiang 
Subject:Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility

Hi Zhijiang,

Glad to know you're interested in this FLIP. I wouldn't claim 100%
compatibility with this FLIP. That's because Flink doesn't have the
functionalities to support all Hive's features. To list a few examples:

   1. Hive allows users to process data with shell scripts -- very similar
   to UDFs [1]
   2. Users can compile inline Groovy UDFs and use them in queries [2]
   3. Users can dynamically add/delete jars, or even execute arbitrary
   shell command [3]

These features cannot be supported merely by a parser/planner, and it's
open to discussion whether Flink even should support them at all.

So the ultimate goal of this FLIP is to provide Hive syntax compatibility for
features that are already available in Flink, which I believe will cover
most common use cases.

[1]
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform#LanguageManualTransform-TRANSFORMExamples
[2]
https://community.cloudera.com/t5/Community-Articles/Apache-Hive-Groovy-UDF-examples/ta-p/245060
[3]
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-HiveInteractiveShellCommands
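For a concrete sense of feature [1] above: Hive's TRANSFORM hands each row of a table to an external script over stdin/stdout, so the "UDF" is an arbitrary process. A toy transform script — an illustration only; the two-column, tab-separated layout and the names used here are assumed, not taken from the FLIP — could look like:

```python
def transform(line):
    """Upper-case the second tab-separated column of one input row."""
    user_id, name = line.rstrip("\n").split("\t")
    return f"{user_id}\t{name.upper()}"

def run(stream):
    """Process an iterable of rows; Hive would hand the script sys.stdin."""
    return [transform(row) for row in stream]

if __name__ == "__main__":
    # A real TRANSFORM script would read sys.stdin and print each result row.
    for out in run(["1\talice\n", "2\tbob\n"]):
        print(out)
```

In HiveQL such a script would be wired in roughly as `ADD FILE upper.py; SELECT TRANSFORM(id, name) USING 'python3 upper.py' AS (id, name) FROM users;` — exactly the kind of external-process execution that a parser/planner alone cannot provide.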

On Thu, Dec 10, 2020 at 6:11 PM Zhijiang 
wrote:

> Thanks for launching the discussion and the FLIP, Rui!
>
> It is really nice to see our continuous efforts for compatibility with
> Hive and benefiting users in this area.
> I am only curious whether there are any other compatibility limitations for
> Hive users after this FLIP? Or can I say that Hive compatibility is
> completely resolved after this FLIP?
> I am interested in the ultimate goal in this area. Maybe it is out of this
> FLIP scope, but I would still appreciate some insights from you if possible. :)
>
> Best,
> Zhijiang
>
>
> --
> From:Rui Li 
> Send Time:2020年12月10日(星期四) 16:46
> To:dev 
> Subject:Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility
>
> Thanks Kurt for your inputs!
>
> I agree we should extend Hive code to support non-Hive tables. I have
> updated the wiki page to remove the limitations you mentioned, and add
> typical use cases in the "Motivation" section.
>
> Regarding comment #b, the interface is defined in flink-table-planner-blink
> and only used by the blink planner. So I think "BlinkParserFactory" is a
> better name, WDYT?
>
> On Mon, Dec 7, 2020 at 12:28 PM Kurt Young  wrote:
>
> > Thanks Rui for starting this discussion.
> >
> > I can see the benefit that we improve hive compatibility further, as
> quite
> > some users are asking for this
> > feature in mailing lists [1][2][3] and some online chatting tools such as
> > DingTalk.
> >
> > I have 3 comments regarding to the design doc:
> >
> > a) Could you add a section to describe the typical use case you want to
> > support after this feature is introduced?
> > In that way, users can also have an impression how to use this feature
> and
> > what the behavior and outcome will be.
> >
> > b) Regarding the naming: "BlinkParserFactory", I suggest renaming it to
> > "FlinkParserFactory".
> >
> > c) About the two limitations you mentioned:
> > 1. Only works with Hive tables and the current catalog needs to be a
> > HiveCatalog.
> > 2. Queries cannot involve tables/views from multiple catalogs.
> > I assume this is because hive parser and analyzer doesn't support
> > referring to a name with "x.y.z" fashion? Since
> > we can control all the behaviors by leveraging the codes hive currently
> > use. Is it possible that we can remove such
> > limitations? The reason is I'm not sure if users can make the whole story
> > work purely depending on hive catalog (that's
> > the reason why I gave comment #a). If multiple catalogs are involved,
> with
> > this limitation I don't think any meaningful
> > pipeline could be built. For example, users want to stream data from
> Kafka
> > to Hive, fully use hive's dialect including
> > query part. The kafka table could be a temporary table or saved in
> default
> > memory catalog.
> >
> >
> > [1] http://apache-flink.147419.n8.nabble.com/calcite-td9059.html#a9118
> > [2]
> http://apache-flink.147419.n8.nabble.com/hive-sql-flink-11-td9116.html
> > [3]
> >
> >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-to-in-Flink-to-support-below-HIVE-SQL-td34162.html
> >

Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility

2020-12-10 Thread Zhijiang
Thanks for launching the discussion and the FLIP, Rui!

It is really nice to see our continuous efforts for compatibility with Hive and 
benefiting users in this area.
I am only curious whether there are any other compatibility limitations for Hive 
users after this FLIP? Or can I say that Hive compatibility is completely 
resolved after this FLIP?
I am interested in the ultimate goal in this area. Maybe it is out of this FLIP 
scope, but I would still appreciate some insights from you if possible. :)

Best,
Zhijiang


--
From:Rui Li 
Send Time:Thursday, December 10, 2020 16:46
To:dev 
Subject:Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility

Thanks Kurt for your inputs!

I agree we should extend Hive code to support non-Hive tables. I have
updated the wiki page to remove the limitations you mentioned, and add
typical use cases in the "Motivation" section.

Regarding comment #b, the interface is defined in flink-table-planner-blink
and only used by the blink planner. So I think "BlinkParserFactory" is a
better name, WDYT?

On Mon, Dec 7, 2020 at 12:28 PM Kurt Young  wrote:

> Thanks Rui for starting this discussion.
>
> I can see the benefit that we improve hive compatibility further, as quite
> some users are asking for this
> feature in mailing lists [1][2][3] and some online chatting tools such as
> DingTalk.
>
> I have 3 comments regarding to the design doc:
>
> a) Could you add a section to describe the typical use case you want to
> support after this feature is introduced?
> In that way, users can also have an impression how to use this feature and
> what the behavior and outcome will be.
>
> b) Regarding the naming: "BlinkParserFactory", I suggest renaming it to
> "FlinkParserFactory".
>
> c) About the two limitations you mentioned:
> 1. Only works with Hive tables and the current catalog needs to be a
> HiveCatalog.
> 2. Queries cannot involve tables/views from multiple catalogs.
> I assume this is because hive parser and analyzer doesn't support
> referring to a name with "x.y.z" fashion? Since
> we can control all the behaviors by leveraging the codes hive currently
> use. Is it possible that we can remove such
> limitations? The reason is I'm not sure if users can make the whole story
> work purely depending on hive catalog (that's
> the reason why I gave comment #a). If multiple catalogs are involved, with
> this limitation I don't think any meaningful
> pipeline could be built. For example, users want to stream data from Kafka
> to Hive, fully use hive's dialect including
> query part. The kafka table could be a temporary table or saved in default
> memory catalog.
>
>
> [1] http://apache-flink.147419.n8.nabble.com/calcite-td9059.html#a9118
> [2] http://apache-flink.147419.n8.nabble.com/hive-sql-flink-11-td9116.html
> [3]
>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-to-in-Flink-to-support-below-HIVE-SQL-td34162.html
>
> Best,
> Kurt
>
>
> On Wed, Dec 2, 2020 at 10:02 PM Rui Li  wrote:
>
> > Hi guys,
> >
> > I'd like to start a discussion about providing HiveQL compatibility for
> > users connecting to a hive warehouse. FLIP-123 has already covered most
> > DDLs. So now it's time to complement the other big missing part --
> queries.
> > With FLIP-152, the hive dialect covers more scenarios and makes it even
> > easier for users to migrate to Flink. More details are in the FLIP wiki
> > page [1]. Looking forward to your feedback!
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
> >
> > --
> > Best regards!
> > Rui Li
> >
>


-- 
Best regards!
Rui Li



Re: Re: [ANNOUNCE] New Apache Flink Committer - Congxian Qiu

2020-10-30 Thread Zhijiang
Congrats, Congxian!


--
From:Yun Gao 
Send Time:Thursday, October 29, 2020 21:21
To:Leonard Xu ; dev 
Subject:Re: Re: [ANNOUNCE] New Apache Flink Committer - Congxian Qiu

Congratulations Congxian !

Best,
 Yun
--
Sender:Leonard Xu
Date:2020/10/29 21:19:56
Recipient:dev
Theme:Re: [ANNOUNCE] New Apache Flink Committer - Congxian Qiu

Congratulations! Congxian

Best,
Leonard
> 在 2020年10月29日,20:55,Stephan Ewen  写道:
> 
> Congrats, Congxian!
> 
> On Thu, Oct 29, 2020 at 12:21 PM Shawn Huang  wrote:
> 
>> Congratulations!
>> 
>> Best,
>> Shawn Huang
>> 
>> 
>> hailongwang <18868816...@163.com> 于2020年10月29日周四 下午7:11写道:
>> 
>>> Congratulations, Congxian!
>>> 
>>> Best,
>>> Hailong Wang
>>> At 2020-10-29 16:45:19, "Yun Tang"  wrote:
 Congratulations, Congxian!
 
 Best
 Yun Tang
 
 From: Xintong Song 
 Sent: Thursday, October 29, 2020 16:16
 To: dev 
 Cc: Congxian Qiu ; klio...@apache.org <
>>> klio...@apache.org>
 Subject: Re: [ANNOUNCE] New Apache Flink Committer - Congxian Qiu
 
 Congratulations, Congxian.
 
 Thank you~
 
 Xintong Song
 
 
 
 On Thu, Oct 29, 2020 at 3:57 PM Jingsong Li 
>>> wrote:
 
> Congratulations!
> 
> On Thu, Oct 29, 2020 at 3:56 PM Yuan Mei 
>>> wrote:
> 
>> Congratulations!
>> 
>> On Thu, Oct 29, 2020 at 3:53 PM Till Rohrmann >> 
>> wrote:
>> 
>>> Congratulations Congxian! Great to have you as a committer now :-)
>>> 
>>> Cheers,
>>> Till
>>> 
>>> On Thu, Oct 29, 2020 at 8:33 AM Benchao Li 
> wrote:
>>> 
 Congratulations!
 
 Biao Liu  于2020年10月29日周四 下午3:20写道:
 
> Congrads!
> 
> Thanks,
> Biao /'bɪ.aʊ/
> 
> 
> 
> On Thu, 29 Oct 2020 at 15:14, Xingbo Huang <
>> hxbks...@gmail.com>
>> wrote:
> 
>> Congratulations Congxian.
>> 
>> Best,
>> Xingbo
>> 
>> Dian Fu  于2020年10月29日周四 下午3:05写道:
>> 
>>> Congratulations Congxian!
>>> 
>>> Regards,
>>> Dian
>>> 
 在 2020年10月29日,下午2:35,Yangze Guo 
>> 写道:
 
 Congratulations!
 
 Best,
 Yangze Guo
 
 On Thu, Oct 29, 2020 at 2:31 PM Jark Wu <
>> imj...@gmail.com
 
>>> wrote:
> 
> Congrats Congxian!
> 
> Best,
> Jark
> 
> On Thu, 29 Oct 2020 at 14:28, Yu Li 
> wrote:
> 
>> Hi all,
>> 
>> On behalf of the PMC, I’m very happy to announce
>>> Congxian
> Qiu
>>> as
 a
>> new
>> Flink committer.
>> 
>> Congxian has been an active contributor for more than
>>> two
>>> years,
> with
>>> 226
>> contributions including 76 commits and many PR
>> reviews.
>> 
>> Congxian mainly works on state backend and checkpoint
>> modules,
>>> meantime is
>> one of the main maintainers of our Chinese document
>>> translation.
>> 
>> Besides his work on the code, he has been driving
> initiatives
>>> on
> dev@
>> list,
>> supporting users and giving talks at conferences.
>> 
>> Please join me in congratulating Congxian for
>> becoming a
>> Flink
>>> committer!
>> 
>> Cheers,
>> Yu
>> 
>>> 
>>> 
>> 
> 
 
 
 --
 
 Best,
 Benchao Li
 
>>> 
>> 
> 
> 
> --
> Best, Jingsong Lee
> 
>>> 
>> 



Re: [DISCUSS] 1.12 feature freeze date += one week ?

2020-10-29 Thread Zhijiang
Thanks for the consideration, Stephan!

+1 for one week extension of the feature freeze; it would be really helpful for 
the ongoing batch shuffle improvement.

Best,
Zhijiang
--
From:Yuan Mei 
Send Time:Thursday, October 29, 2020 11:03
To:dev 
Subject:Re: [DISCUSS] 1.12 feature freeze date += one week ?

Hey Stephan,

Thanks for bringing this up!

+1 to one week extension for 1.12 feature freeze.

Things I am working on are under review and one week would definitely ease
my (and reviewers') life a lot.

Best,

Yuan

On Thu, Oct 29, 2020 at 2:13 AM Stephan Ewen  wrote:

> Hi all!
>
> We are approaching the feature freeze date for the 1.12 release that was
> discussed earlier.
>
> From my side and the developments I am involved with, we are close and in
> good shape, but could really use one more week to round things off. It
> would help both the code quality and our mental health a lot :-)
>
> From some personal conversations I heard at least from some other
> committers a similar sentiment.
>
> My proposal would hence be to set the 1.12 cutoff date to the weekend of
> the 7th/8th and fork the release-1.12 branch on Monday Nov. 9th.
>
> What do you think?
>
> Best,
> Stephan
>



Re: [VOTE] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2020-10-25 Thread Zhijiang
Thanks for driving this improvement, Yingjie! 

+1 (binding)

Best,
Zhijiang


--
From:Kurt Young 
Send Time:Monday, October 26, 2020 11:41
To:dev 
Subject:Re: [VOTE] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to 
Flink

+1 (binding)

Best,
Kurt


On Mon, Oct 26, 2020 at 11:19 AM Yingjie Cao 
wrote:

> Hi devs,
>
> I'd like to start a vote for FLIP-148: Introduce Sort-Merge Based Blocking
> Shuffle to Flink [1] which is discussed in discussion thread [2].
>
> The vote will last for at least 72 hours until a consensus voting.
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-148%3A+Introduce+Sort-Merge+Based+Blocking+Shuffle+to+Flink
> [2]
>
> https://lists.apache.org/thread.html/r11750db945277d944f408eaebbbdc9d595d587fcfb67b015c716404e%40%3Cdev.flink.apache.org%3E
>



[jira] [Created] (FLINK-19745) Supplement micro-benchmark for bounded blocking partition in remote channel case

2020-10-20 Thread Zhijiang (Jira)
Zhijiang created FLINK-19745:


 Summary: Supplement micro-benchmark for bounded blocking partition 
in remote channel case
 Key: FLINK-19745
 URL: https://issues.apache.org/jira/browse/FLINK-19745
 Project: Flink
  Issue Type: Task
  Components: Benchmarks, Runtime / Network
Reporter: Zhijiang
Assignee: Zhijiang


The current benchmark `BlockingPartitionBenchmark` for batch jobs only measures 
the scenario where the producer & consumer are deployed in the same process, 
which corresponds to a local input channel on the consumer side.

We want to supplement another common scenario that measures the effect of reading 
data via network shuffle, which corresponds to a remote input channel on the 
consumer side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-135: Approximate Task-Local Recovery

2020-10-19 Thread Zhijiang
Thanks for driving this effort, Yuan. 

+1 (binding) on my side.

Best,
Zhijiang


--
From:Piotr Nowojski 
Send Time:Monday, October 19, 2020 21:02
To:dev 
Subject:Re: [VOTE] FLIP-135: Approximate Task-Local Recovery

Hey,

I carry over my +1 (binding) from the discussion thread.

Best,
Piotrek

On Mon, Oct 19, 2020 at 14:56, Yuan Mei  wrote:

> Hey,
>
> I would like to start a voting thread for FLIP-135 [1], for approximate
> task local recovery. The proposal has been discussed in [2].
>
> The vote will be open till Oct. 23rd (72h, excluding weekends) unless there
> is an objection or not enough votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-135+Approximate+Task-Local+Recovery
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-135-Approximate-Task-Local-Recovery-tp43930.html
>
>
> Best
>
> Yuan
>



Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2020-10-19 Thread Zhijiang
Thanks for launching the discussion and the respective FLIP, Yingjie!

In general, I am +1 for this proposal, since sort-merge shuffle has already been 
widely adopted in other batch-oriented projects like MR and Spark, and it indeed 
brings performance benefits in some scenarios, as mentioned in the FLIP.

I only have some thoughts on the `Public Interfaces` section, since it concerns 
how users understand and make good use of the feature in practice. The newly 
introduced classes can be reviewed in the follow-up PR, since no existing 
interfaces need refactoring at the moment.

1. taskmanager.network.sort-merge-blocking-shuffle.max-files-per-partition: the 
default value should be `1`, I guess? It is better to give a proper default 
value that most users do not need to care about in practice.

2. taskmanager.network.sort-merge-blocking-shuffle.buffers-per-partition: how 
about making the default the same as the number of buffers the LocalBufferPool 
currently requires for a result partition? Then it is transparent to users that 
no extra memory is needed, no matter whether the hash-based or the sort-merge 
based way is used. For tuning this setting, it is better to give some hints 
that guide users on how to adjust it for better performance based on the 
relevant factors.

3. taskmanager.network.sort-merge-blocking-shuffle.min-parallelism: I guess it 
is not easy to determine a proper value for the switch between the hash-based 
and sort-merge based ways. How much data a subpartition takes (e.g. for 
broadcast), and whether it suits the hash-based way, is not completely decided 
by the parallelism, so users might be confused about how to tune it in 
practice. I prefer a simple boolean option for ease of use, with a default 
value of false in the MVP. Then upgrading to the new version brings no change 
for users by default, and the sort-merge option can be enabled to try out in 
desired scenarios if users are willing.
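To make the hash-vs-sort-merge trade-off behind these options concrete, here is a back-of-the-envelope sketch of how many shuffle files each strategy creates. This is my own illustration with assumed semantics (one file per subpartition for hash-style, a bounded per-partition file count for sort-merge), not code from the FLIP:

```python
def hash_shuffle_files(producers, consumers):
    # Hash-style blocking shuffle: each producer task writes one file
    # per downstream consumer, i.e. per subpartition.
    return producers * consumers

def sort_merge_shuffle_files(producers, max_files_per_partition=1):
    # Sort-merge shuffle: each producer sorts its output and merges it into
    # a bounded number of files per result partition (assuming the proposed
    # max-files-per-partition option defaults to 1).
    return producers * max_files_per_partition

if __name__ == "__main__":
    p = 1000  # the 1000 * 1000 parallelism benchmark mentioned in this thread
    print(hash_shuffle_files(p, p))     # one million small files
    print(sort_merge_shuffle_files(p))  # a thousand larger files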

Best,
Zhijiang
--
From:Till Rohrmann 
Send Time:2020年10月16日(星期五) 15:42
To:dev 
Subject:Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to 
Flink

Thanks for sharing the preliminary numbers with us Yingjie. The numbers
look quite impressive :-)

Cheers,
Till

On Thu, Oct 15, 2020 at 5:25 PM Yingjie Cao  wrote:

> Hi Till,
>
> Thanks for your reply and comments.
>
> You are right, the proposed sort-merge based shuffle is an extension of the
> existing blocking shuffle and does not change any default behavior of
> Flink.
>
> As for the performance, according to our previous experience, sort-merge
> based implementation can reduce the shuffle time by 30% to even 90%
> compared to hash-based implementation. My PoC implementation without any
> further optimization can already reduce the shuffle time over 10% on SSD
> and over 70% on HDD for a simple 1000 * 1000 parallelism benchmark job.
>
> After switching to the sort-merge based blocking shuffle, some of our users'
> jobs can scale up to over 2 parallelism, though some JM and RM side
> optimizations are needed. I haven't tried to find where the upper bound is,
> but I guess several tens of thousands should be able to meet
> the needs of most users.
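As a rough back-of-the-envelope sketch (not from this thread itself), the key
scaling difference is the number of concurrently written shuffle files. The
assumption here is that each upstream task writes one file per downstream task
in the hash-based case, and a bounded number of merged files per result
partition in the sort-merge case:

```python
def hash_shuffle_files(parallelism: int) -> int:
    # hash-based: every producer writes one file per consumer subpartition
    return parallelism * parallelism


def sort_merge_shuffle_files(parallelism: int,
                             max_files_per_partition: int = 1) -> int:
    # sort-merge: every producer sorts its output and merges it into a
    # bounded number of files, independent of the consumer count
    return parallelism * max_files_per_partition


# the "1000 * 1000 parallelism benchmark job" mentioned above
print(hash_shuffle_files(1000))        # 1000000
print(sort_merge_shuffle_files(1000))  # 1000
```

This is why the hash-based approach runs into file-count pressure at high
parallelism while the sort-merge approach stays linear in the parallelism.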
>
> Best,
> Yingjie
>
> Till Rohrmann  wrote on Thu, Oct 15, 2020 at 3:57 PM:
>
> > Hi Yingjie,
> >
> > thanks for proposing the sort-merge based blocking shuffle. I like the
> > proposal and it does not seem to change the internals of Flink. Instead
> it
> > is an extension of existing interfaces which makes it a
> > non-invasive addition.
> >
> > Do you have any numbers comparing the performance of the sort-merge based
> > shuffle against the hash-based shuffle? To what parallelism can you scale
> > up when using the sort-merge based shuffle?
> >
> > Cheers,
> > Till
> >
> > On Thu, Oct 15, 2020 at 5:03 AM Yingjie Cao 
> > wrote:
> >
> > > Hi devs,
> > >
> > > Currently, Flink adopts a hash-style blocking shuffle implementation
> > > which writes data sent to different reducer tasks into separate files
> > > concurrently. Compared to the sort-merge based approach, which writes
> > > those data together into a single file and merges those small files
> > > into bigger ones, the hash-based approach has several weak points when
> > > it comes to running large scale batch jobs:
> > >
> > >1. *Stability*: For high parallelism (tens of thousands) batch jobs,
> > >the current hash-based blocking shuffle implementation writes too many
> > >files concurrently, which gives high pressure to 

Re: [ANNOUNCE] New PMC member: Zhu Zhu

2020-10-09 Thread Zhijiang
Congratulations and welcome, Zhu Zhu!

Best,
Zhijiang
--
From:Yun Tang 
Send Time: Fri, Oct 9, 2020 14:20
To:dev@flink.apache.org 
Subject:Re: [ANNOUNCE] New PMC member: Zhu Zhu

Congratulations, Zhu!

Best
Yun Tang

From: Danny Chan 
Sent: Friday, October 9, 2020 13:51
To: dev@flink.apache.org 
Subject: Re: [ANNOUNCE] New PMC member: Zhu Zhu

Congrats, Zhu Zhu ~

Best,
Danny Chan
On Oct 9, 2020 at 1:05 PM +0800, dev@flink.apache.org wrote:
>
> Congrats, Zhu Zhu



[jira] [Created] (FLINK-19551) Follow up improvements for shuffle service

2020-10-09 Thread Zhijiang (Jira)
Zhijiang created FLINK-19551:


 Summary: Follow up improvements for shuffle service 
 Key: FLINK-19551
 URL: https://issues.apache.org/jira/browse/FLINK-19551
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination, Runtime / Network
Reporter: Zhijiang


After resolving the core architecture and functions of the pluggable shuffle 
service proposed by FLINK-10653, there are still some pending follow-up issues 
to be tracked in the future in this umbrella ticket with low priority.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.11.2 released

2020-09-18 Thread Zhijiang
Congratulations! Thanks to Zhu Zhu for being the release manager, and to 
everyone involved!


Best,
Zhijiang


--
From:Lijie Wang 
Send Time: Fri, Sep 18, 2020 17:48
To:dev@flink.apache.org 
Subject:Re:[ANNOUNCE] Apache Flink 1.11.2 released

Congratulations! Thanks @ZhuZhu for driving this release!




On 09/17/2020 13:29, Zhu Zhu wrote:
The Apache Flink community is very happy to announce the release of Apache
Flink 1.11.2, which is the second bugfix release for the Apache Flink 1.11
series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements
for this bugfix release:
https://flink.apache.org/news/2020/09/17/release-1.11.2.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Thanks,
Zhu



Re: Re: [ANNOUNCE] New Apache Flink Committer - Igal Shilman

2020-09-15 Thread Zhijiang
Congratulations and welcome, Igal!


--
From:Yun Gao 
Send Time: Wed, Sep 16, 2020 10:59
To:Stephan Ewen ; dev 
Subject:Re: Re: [ANNOUNCE] New Apache Flink Committer - Igal Shilman

Congratulations Igal!

Best,
 Yun






--
Sender:Stephan Ewen
Date:2020/09/15 22:48:30
Recipient:dev
Theme:Re: [ANNOUNCE] New Apache Flink Committer - Igal Shilman

Welcome, Igal!

On Tue, Sep 15, 2020 at 3:18 PM Seth Wiesman  wrote:

> Congrats Igal!
>
> On Tue, Sep 15, 2020 at 7:13 AM Benchao Li  wrote:
>
> > Congratulations!
> >
> > > Zhu Zhu  wrote on Tue, Sep 15, 2020 at 6:51 PM:
> >
> > > Congratulations, Igal!
> > >
> > > Thanks,
> > > Zhu
> > >
> > > > Rafi Aroch  wrote on Tue, Sep 15, 2020 at 6:43 PM:
> > >
> > > > Congratulations Igal! Well deserved!
> > > >
> > > > Rafi
> > > >
> > > >
> > > > On Tue, Sep 15, 2020 at 11:14 AM Tzu-Li (Gordon) Tai <
> > > tzuli...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > It's great seeing many new Flink committers recently, and to add to
> > > that
> > > > > I'd like to announce one more new committer: Igal Shilman!
> > > > >
> > > > > Igal has been a long time member of the community. You may very
> > likely
> > > > know
> > > > > Igal from the Stateful Functions sub-project, as he was the
> original
> > > > author
> > > > > of it before it was contributed to Flink.
> > > > > Ever since StateFun was contributed to Flink, he has consistently
> > > > > maintained the project and supported users in the mailing lists.
> > > > > Before that, he had also helped tremendously in some work on
> Flink's
> > > > > serialization stack.
> > > > >
> > > > > Please join me in welcoming and congratulating Igal for becoming a
> > > Flink
> > > > > committer!
> > > > >
> > > > > Cheers,
> > > > > Gordon
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>




Re: [ANNOUNCE] New Apache Flink Committer - Yun Tang

2020-09-15 Thread Zhijiang
Congratulations and welcome, Yun!


--
From:Jark Wu 
Send Time: Wed, Sep 16, 2020 11:35
To:dev 
Cc:tangyun ; Yun Tang 
Subject:Re: [ANNOUNCE] New Apache Flink Committer - Yun Tang

Congratulations Yun!

On Wed, 16 Sep 2020 at 10:40, Rui Li  wrote:

> Congratulations Yun!
>
> On Wed, Sep 16, 2020 at 10:20 AM Paul Lam  wrote:
>
> > Congrats, Yun! Well deserved!
> >
> > Best,
> > Paul Lam
> >
> > > On Sep 15, 2020, at 19:14, Yang Wang  wrote:
> > >
> > > Congratulations, Yun!
> > >
> > > Best,
> > > Yang
> > >
> > > Leonard Xu  wrote on Tue, Sep 15, 2020 at 7:11 PM:
> > >
> > >> Congrats, Yun!
> > >>
> > >> Best,
> > >> Leonard
> > >>> On Sep 15, 2020, at 19:01, Yangze Guo  wrote:
> > >>>
> > >>> Congrats, Yun!
> > >>
> > >>
> >
> >
>
> --
> Best regards!
> Rui Li
>



Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Zhijiang
+1 (binding)

- checked the checksums and GPG files 
- verified that the source archives do not contains any binaries
- checked that all POM files point to the same version
- reviewed the web site PR https://github.com/apache/flink-web/pull/377
- checked the release note

Best,
Zhijiang


--
From:Dian Fu 
Send Time: Tue, Sep 15, 2020 10:11
To:dev 
Subject:Re: [VOTE] Release 1.11.2, release candidate #1

+1 (binding)

- checked the signature and checksum
- reviewed the web-site PR and it looks good to me
- checked the diff for dependencies changes since 1.11.1: 
https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1 
<https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1>
- checked the release note

Thanks,
Dian

> On Sep 14, 2020, at 9:30 PM, Xingbo Huang  wrote:
> 
> +1 (non-binding)
> 
> Checks:
> 
> - Pip install PyFlink from wheel packages with Python 3.5,3.6 and 3.7 in
> Mac and Linux.
> - Test Python UDF/Pandas UDF
> - Test from_pandas/to_pandas
> 
> Best,
> Xingbo
> 
> Fabian Paul  wrote on Mon, Sep 14, 2020 at 8:46 PM:
> 
>> +1 (non-binding)
>> 
>> Checks:
>> 
>> - Verified signature
>> - Built from source (Java8)
>> - Ran custom jobs on Kubernetes
>> 
>> Regards,
>> Fabian
>> 




Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Zhijiang
Congrats, Niels!

Best,
Zhijiang


--
From:Darion Yaphet 
Send Time: Tue, Sep 15, 2020 10:02
To:dev 
Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

Congratulations!

刘建刚  wrote on Tue, Sep 15, 2020 at 9:53 AM:

> Congratulations!
>
> Best,
> liujiangang
>
> Danny Chan  wrote on Tue, Sep 15, 2020 at 9:44 AM:
>
> > Congratulations! 
> >
> > Best,
> > Danny Chan
> > On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> > >
> > > Congratulations! 
> >
>


-- 

long is the way and hard  that out of Hell leads up to light



[ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Zhijiang
Hi all,

On behalf of the PMC, I’m very happy to announce Arvid Heise as a new Flink 
committer.

Arvid has been an active community member for more than a year, with 138 
contributions including 116 commits, and has reviewed many PRs with 
high-quality comments.
He mainly works on the runtime scope and has been involved in critical features 
like the task mailbox model and unaligned checkpoints.
Besides that, he has been super active in replying to questions on the user 
mailing list (34 emails in March, 51 emails in June, etc.), and is also active 
in dev mailing list and Jira issue discussions.

Please join me in congratulating Arvid for becoming a Flink committer!

Best,
Zhijiang

Re: [DISCUSS] Releasing Flink 1.11.2

2020-09-02 Thread Zhijiang
Thanks for launching this discussion and volunteering as the release manager. 
+1 on my side, and I am willing to provide any help during the release 
procedure. :)


Best,
Zhijiang


--
From:Konstantin Knauf 
Send Time: Wed, Sep 2, 2020 23:44
To:dev 
Cc:khachatryan.roman 
Subject:Re: [DISCUSS] Releasing Flink 1.11.2

I think it would be nice to include a fix for
https://issues.apache.org/jira/browse/FLINK-18934, too, as it affects a
highly requested feature of Flink 1.11 quite severely.

On Wed, Sep 2, 2020 at 2:51 PM Till Rohrmann  wrote:

> Thanks a lot for starting this discussion Zhu Zhu and for volunteering as
> the release manager. Big +1 for creating the next 1.11 bug fix release. I
> think we already collected quite a bit of fixes which our users will
> benefit from.
>
> For the pending fixes, I would suggest setting a soft deadline (maybe until
> beginning of next week) and then starting to cut the release (given that no
> other blocker issues pop up). Maybe we are able to resolve the issues even
> earlier which would allow us to cut the release also earlier.
>
> From my side I would like to include FLINK-18959 in the release. But it is
> not a strict release blocker.
>
> Cheers,
> Till
>
> On Wed, Sep 2, 2020 at 2:40 PM David Anderson 
> wrote:
>
> > I think it's worth considering whether we can get this bugfix included in
> > 1.11.2:
> >
> > - FLINK-19109 Split Reader eats chained periodic watermarks
> >
> > There is a PR, but it's still a work in progress. Cc'ing Roman, who has
> > been working on this.
> >
> > Regards,
> > David
> >
> >
> > On Wed, Sep 2, 2020 at 2:19 PM Zhu Zhu  wrote:
> >
> > > Hi All,
> > >
> > > It has been about 1 month since we released Flink 1.11.1. It's not too
> > far
> > > but
> > > we already have more than 80 resolved improvements/bugs in the
> > release-1.11
> > > branch. Some of them are quite critical. Therefore, I propose to create
> > the
> > > next
> > > bugfix release 1.11.2 for Flink 1.11.
> > >
> > > Most noticeable fixes are:
> > > - FLINK-18769 MiniBatch doesn't work with FLIP-95 source
> > > - FLINK-18902 Cannot serve results of asynchronous REST operations in
> > > per-job mode
> > > - FLINK-18682 Vector orc reader cannot read Hive 2.0.0 table
> > > - FLINK-18608 CustomizedConvertRule#convertCast drops nullability
> > > - FLINK-18646 Managed memory released check can block RPC thread
> > > - FLINK-18993 Invoke sanityCheckTotalFlinkMemory method incorrectly in
> > > JobManagerFlinkMemoryUtils.java
> > > - FLINK-18663 RestServerEndpoint may prevent server shutdown
> > > - FLINK-18595 Deadlock during job shutdown
> > > - FLINK-18581 Cannot find GC cleaner with java version previous
> > > jdk8u72(-b01)
> > > - FLINK-17075 Add task status reconciliation between TM and JM
> > >
> > > Furthermore, I think the following blocker issue should be merged
> before
> > > 1.11.2 release
> > >
> > > - FLINK-19121 Avoid accessing HDFS frequently in HiveBulkWriterFactory
> > >
> > > I would volunteer as the release manager and kick off the release
> > process.
> > > What do you think?
> > >
> > > Please let me know if there are any concerns or any missing blocker
> > issues
> > > need to be fixed in 1.11.2.
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> >
>


-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk



Re: Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-27 Thread Zhijiang
Congrats, Dian!


--
From:Yun Gao 
Send Time: Thu, Aug 27, 2020 17:44
To:dev ; Dian Fu ; user 
; user-zh 
Subject:Re: Re: [ANNOUNCE] New PMC member: Dian Fu

Congratulations Dian !

 Best
 Yun


--
Sender:Marta Paes Moreira
Date:2020/08/27 17:42:34
Recipient:Yuan Mei
Cc:Xingbo Huang; jincheng sun; 
dev; Dian Fu; 
user; user-zh
Theme:Re: [ANNOUNCE] New PMC member: Dian Fu

Congrats, Dian!
On Thu, Aug 27, 2020 at 11:39 AM Yuan Mei  wrote:

Congrats!
On Thu, Aug 27, 2020 at 5:38 PM Xingbo Huang  wrote:

Congratulations Dian!

Best,
Xingbo
jincheng sun  wrote on Thu, Aug 27, 2020 at 5:24 PM:

Hi all,

On behalf of the Flink PMC, I'm happy to announce that Dian Fu is now part of 
the Apache Flink Project Management Committee (PMC).

Dian Fu has been very active on the PyFlink component, working on various 
important features such as the Python UDF and Pandas integration. He keeps 
checking and voting for our releases, has successfully produced two releases 
(1.9.3 & 1.11.1) as RM, and is currently working as RM to push forward the 
release of Flink 1.12.

Please join me in congratulating Dian Fu for becoming a Flink PMC Member!

Best,
Jincheng(on behalf of the Flink PMC)



Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-27 Thread Zhijiang
Congrats! Thanks to Zhu Zhu for the release manager work and to everyone involved!

Best,
Zhijiang
--
From:liupengcheng 
Send Time: Wed, Aug 26, 2020 19:37
To:dev ; Xingbo Huang 
Cc:Guowei Ma ; user-zh ; Yangze 
Guo ; Dian Fu ; Zhu Zhu 
; user 
Subject:Re: [ANNOUNCE] Apache Flink 1.10.2 released

Thanks ZhuZhu for managing this release and everyone who contributed to this.

Best,
Pengcheng

On 2020/8/26 at 7:06 PM, "Congxian Qiu" wrote:

Thanks ZhuZhu for managing this release and everyone else who contributed
to this release!

Best,
Congxian


Xingbo Huang  wrote on Wed, Aug 26, 2020 at 1:53 PM:

> Thanks Zhu for the great work and everyone who contributed to this 
release!
>
> Best,
> Xingbo
>
> Guowei Ma  wrote on Wed, Aug 26, 2020 at 12:43 PM:
>
>> Hi,
>>
>> Thanks a lot for being the release manager Zhu Zhu!
>> Thanks everyone contributed to this!
>>
>> Best,
>> Guowei
>>
>>
>> On Wed, Aug 26, 2020 at 11:18 AM Yun Tang  wrote:
>>
>>> Thanks for Zhu's work to manage this release and everyone who
>>> contributed to this!
>>>
>>> Best,
>>> Yun Tang
>>> 
>>> From: Yangze Guo 
>>> Sent: Tuesday, August 25, 2020 14:47
>>> To: Dian Fu 
>>> Cc: Zhu Zhu ; dev ; user <
>>> u...@flink.apache.org>; user-zh 
>>> Subject: Re: [ANNOUNCE] Apache Flink 1.10.2 released
>>>
>>> Thanks a lot for being the release manager Zhu Zhu!
>>> Congrats to all others who have contributed to the release!
>>>
>>> Best,
>>> Yangze Guo
>>>
>>> On Tue, Aug 25, 2020 at 2:42 PM Dian Fu  wrote:
>>> >
>>> > Thanks ZhuZhu for managing this release and everyone else who
>>> contributed to this release!
>>> >
>>> > Regards,
>>> > Dian
>>> >
>>> > On Aug 25, 2020, at 2:22 PM, Till Rohrmann  wrote:
>>> >
>>> > Great news. Thanks a lot for being our release manager Zhu Zhu and to
>>> all others who have contributed to the release!
>>> >
>>> > Cheers,
>>> > Till
>>> >
>>> > On Tue, Aug 25, 2020 at 5:37 AM Zhu Zhu  wrote:
>>> >>
>>> >> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.10.2, which is the first bugfix release for the Apache 
Flink
>>> 1.10 series.
>>> >>
>>> >> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data 
streaming
>>> applications.
>>> >>
>>> >> The release is available for download at:
>>> >> https://flink.apache.org/downloads.html
>>> >>
>>> >> Please check out the release blog post for an overview of the
>>> improvements for this bugfix release:
>>> >> https://flink.apache.org/news/2020/08/25/release-1.10.2.html
>>> >>
>>> >> The full release notes are available in Jira:
>>> >>
>>> 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12347791
>>> >>
>>> >> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>> >>
>>> >> Thanks,
>>> >> Zhu
>>> >
>>> >
>>>
>>



Re: [ANNOUNCE] New Flink Committer: David Anderson

2020-08-19 Thread Zhijiang
Congratulations David!


--
From:Jeff Zhang 
Send Time: Wed, Aug 19, 2020 23:34
To:dev 
Subject:Re: [ANNOUNCE] New Flink Committer: David Anderson

Congratulations David!

Kostas Kloudas  wrote on Wed, Aug 19, 2020 at 11:32 PM:

> Congratulations David!
>
> Kostas
>
> On Wed, Aug 19, 2020 at 2:33 PM Arvid Heise  wrote:
> >
> > Congrats David!
> >
> > On Wed, Aug 19, 2020 at 11:17 AM Fabian Hueske 
> wrote:
> >
> > > Congrats David, well deserved!
> > >
> > > Cheers,
> > > Fabian
> > >
> > > Am Mi., 19. Aug. 2020 um 11:05 Uhr schrieb Marta Paes Moreira <
> > > ma...@ververica.com>:
> > >
> > > > Congrats, David! Thanks for being the Flink Stack Overflow hawk (on
> top
> > > of
> > > > everything else, of course)!
> > > >
> > > > Marta
> > > >
> > > > On Thu, Aug 13, 2020 at 5:26 AM Roc Marshal  wrote:
> > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Congratulations David!
> > > > >
> > > > >
> > > > > Best,
> > > > > Roc Marshal.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2020-08-12 15:50:47, "Robert Metzger" 
> wrote:
> > > > > >Hi everyone,
> > > > > >
> > > > > >On behalf of the PMC, I'm very happy to announce David Anderson
> as a
> > > new
> > > > > >Apache
> > > > > >Flink committer.
> > > > > >
> > > > > >David has been a Flink community member for a long time. His first
> > > > commit
> > > > > >dates back to 2016, code changes mostly involve the
> documentation, in
> > > > > >particular with the recent contribution of Flink training
> materials.
> > > > > >Besides that, David has been giving numerous talks and trainings
> on
> > > > Flink.
> > > > > >On StackOverflow, he's among the most active helping Flink users
> to
> > > > solve
> > > > > >their problems (2nd in the all-time ranking, 1st in the last 30
> > > days). A
> > > > > >similar level of activity can be found on the user@ mailing list.
> > > > > >
> > > > > >Please join me in congratulating David for becoming a Flink
> committer!
> > > > > >
> > > > > >Best,
> > > > > >Robert
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Arvid Heise | Senior Java Developer
> >
> > 
> >
> > Follow us @VervericaData
> >
> > --
> >
> > Join Flink Forward  - The Apache Flink
> > Conference
> >
> > Stream Processing | Event Driven | Real Time
> >
> > --
> >
> > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >
> > --
> > Ververica GmbH
> > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> > (Toni) Cheng
>


-- 
Best Regards

Jeff Zhang



[jira] [Created] (FLINK-19003) Add micro-benchmark for unaligned checkpoints

2020-08-19 Thread Zhijiang (Jira)
Zhijiang created FLINK-19003:


 Summary: Add micro-benchmark for unaligned checkpoints
 Key: FLINK-19003
 URL: https://issues.apache.org/jira/browse/FLINK-19003
 Project: Flink
  Issue Type: Task
  Components: Benchmarks, Runtime / Checkpointing
Reporter: Zhijiang
Assignee: Zhijiang


It is necessary to supplement an unaligned checkpoint benchmark to verify our 
follow-up improvements and to measure any effects in the future.

The benchmark should cover both remote and local channels separately for 
different code paths, and it also needs to guarantee there are some in-flight 
buffers during checkpoint for measuring the channel state snapshot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] New Flink Committer: David Anderson

2020-08-12 Thread Zhijiang
Congrats, David!


--
From:Zhu Zhu 
Send Time: Wed, Aug 12, 2020 17:08
To:dev 
Subject:Re: [ANNOUNCE] New Flink Committer: David Anderson

Congratulations, David!

Paul Lam  wrote on Wed, Aug 12, 2020 at 4:49 PM:

> Congrats, David!
>
> Best,
> Paul Lam
>
> > > On Aug 12, 2020, at 16:46, Benchao Li  wrote:
> >
> > Congratulations!  David
> >
> > > Leonard Xu  wrote on Wed, Aug 12, 2020 at 4:01 PM:
> >
> >> Congratulations!  David
> >>
> >> Best
> >> Leonard
> > >>> On Aug 12, 2020, at 15:59, Till Rohrmann  wrote:
> >>>
> >>> Congratulations, David!
> >>
> >>
> >
> > --
> >
> > Best,
> > Benchao Li
>
>



Re: [DISCUSS] Releasing Flink 1.10.2

2020-08-07 Thread Zhijiang
Thanks for volunteering as the release manager, Zhu Zhu.

+1 for the 1.10.2 release, and I am willing to help with any operations in the 
procedure that need PMC permissions.

Best,
Zhijiang


--
From:Robert Metzger 
Send Time: Fri, Aug 7, 2020 14:16
To:dev 
Subject:Re: [DISCUSS] Releasing Flink 1.10.2

Thanks for taking care of this Zhu Zhu. The list of bugs from your list
certainly justifies pushing out a bugfix release.
I would propose to wait until Monday for people to speak up if they want to
have a fix included in the release. Otherwise, we could create the first RC
on Monday evening (China time).





On Thu, Aug 6, 2020 at 2:53 PM Till Rohrmann  wrote:

> Thanks for kicking this discussion off Zhu Zhu. +1 for the 1.10.2 release.
> Also thanks for volunteering as the release manager!
>
> Cheers,
> Till
>
> On Thu, Aug 6, 2020 at 1:26 PM Zhu Zhu  wrote:
>
> > Hi All,
> >
> > It has been more than 2 months since we released Flink 1.10.1. We already
> > have more than 60 resolved improvements/bugs in the release-1.10 branch.
> > Therefore, I propose to create the next bugfix release 1.10.2 for Flink
> > 1.10.
> >
> > Most noticeable fixes are:
> > - FLINK-18663 RestServerEndpoint may prevent server shutdown
> > - FLINK-18595 Deadlock during job shutdown
> > - FLINK-18539 StreamExecutionEnvironment#addSource(SourceFunction,
> > TypeInformation) doesn't use the user defined type information
> > - FLINK-18048 "--host" option could not take effect for standalone
> > application cluster
> > - FLINK-18045 Fix Kerberos credentials checking to unblock Flink on
> secured
> > MapR
> > - FLINK-18035 Executors#newCachedThreadPool could not work as expected
> > - FLINK-18012 Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive
> is
> > called
> > - FLINK-17800 RocksDB optimizeForPointLookup results in missing time
> > windows
> > - FLINK-17558 Partitions are released in TaskExecutor Main Thread
> > - FLINK-17466 toRetractStream doesn't work correctly with Pojo conversion
> > class
> > - FLINK-16451 Fix IndexOutOfBoundsException for DISTINCT AGG with
> constants
> >
> > There is no known blocker issue of 1.10.2 release at the moment.
> >
> > I would volunteer as the release manager and kick off the release
> process.
> > What do you think?
> >
> > Please let me know if there are any concerns or any missing blocker
> issues
> > need to be fixed in 1.10.2.
> >
> > Thanks,
> > Zhu Zhu
> >
>



Re: [DISCUSS] Planning Flink 1.12

2020-08-06 Thread Zhijiang
+1 on my side for feature freeze date by the end of Oct.


--
From:Yuan Mei 
Send Time:2020年8月6日(星期四) 14:54
To:dev 
Subject:Re: [DISCUSS] Planning Flink 1.12

+1

> +1 for extending the feature freeze date to the end of October.



On Thu, Aug 6, 2020 at 12:08 PM Yu Li  wrote:

> +1 for extending feature freeze date to end of October.
>
> Feature development in the master branch could be unblocked through
> creating the release branch, but every coin has its two sides (smile)
>
> Best Regards,
> Yu
>
>
> On Wed, 5 Aug 2020 at 20:12, Robert Metzger  wrote:
>
> > Thanks all for your opinion.
> >
> > @Chesnay: That is a risk, but I hope the people responsible for
> individual
> > FLIPs plan accordingly. Extending the time till the feature freeze should
> > not mean that we are extending the scope of the release.
> > Ideally, features are done before FF, and they use the time till the
> freeze
> > for additional testing and documentation polishing.
> > This FF will be virtual, there should be less disruption than a physical
> > conference with all the travelling.
> > Do you have a different proposal for the timing?
> >
> >
> > I'm currently considering splitting the feature freeze and the release
> > branch creation. Similar to the Linux kernel development, we could have a
> > "merge window" and a stabilization phase. At the end of the stabilization
> > phase, we cut the release branch and open the next merge window (I'll
> start
> > a separate thread regarding this towards the end of this release cycle,
> if
> > I still like the idea then)
> >
> >
> > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler 
> > wrote:
> >
> > > I'm a bit concerned about end of October, because it means we have
> Flink
> > > forward, which usually means at least 1 week of little-to-no activity,
> > > and then 1 week until feature-freeze.
> > >
> > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > +1 for end of October from me as well.
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > >> Kostas Kloudas  wrote on Wed, Aug 5, 2020 at 4:59 PM:
> > > >
> > > >> +1 for end of October from me as well.
> > > >>
> > > >> Cheers,
> > > >> Kostas
> > > >>
> > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann 
> > > wrote:
> > > >>
> > > >>> +1 for end of October from my side as well.
> > > >>>
> > > >>> Cheers,
> > > >>> Till
> > > >>>
> > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen 
> > wrote:
> > > >>>
> > >  The end of October sounds good from my side, unless it collides
> with
> > > >> some
> > >  holidays that affect many committers.
> > > 
> > >  Feature-wise, I believe we can definitely make good use of the
> time
> > to
> > > >>> wrap
> > >  up some critical threads (like finishing the FLIP-27 source
> > efforts).
> > > 
> > >  So +1 to the end of October from my side.
> > > 
> > >  Best,
> > >  Stephan
> > > 
> > > 
> > >  On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <
> rmetz...@apache.org>
> > > >>> wrote:
> > > > Thanks a lot for commenting on the feature freeze date.
> > > >
> > > > You are raising a few good points on the timing.
> > > > If we have already (2 months before) concerns regarding the
> > deadline,
> > >  then
> > > > I agree that we should move it till the end of October.
> > > >
> > > > We then just need to be careful not to run into the Christmas
> > season
> > > >> at
> > >  the
> > > > end of December.
> > > >
> > > > If nobody objects within a few days, I'll update the feature
> freeze
> > > >>> date
> > >  in
> > > > the Wiki.
> > > >
> > > >
> > > > On Tue, Aug 4, 2020 at 7:52 AM Kurt Young 
> > wrote:
> > > >
> > > >> Regarding setting the feature freeze date to late September, I
> > have
> > >  some
> > > >> concern that it might make
> > > >> the development time of 1.12 too short.
> > > >>
> > > >> One reason for this is we took too much time (about 1.5 month,
> > from
> > > >>> mid
> > > > of
> > > >> May to beginning of July)
> > > >> for testing 1.11. It's not ideal but further squeeze the
> > > >> development
> > >  time
> > > >> of 1.12 won't make this better.
> > > >>   Besides, AFAIK July & August is also a popular vacation season
> > for
> > > >> European. Given the fact most
> > > >>   committers of Flink come from Europe, I think we should also
> > take
> > > >>> this
> > > >> into consideration.
> > > >>
> > > >> It's also true that the first week of October is the national
> > > >> holiday
> > >  of
> > > >> China, so I'm wondering whether the
> > > >> end of October could be a candidate feature freeze date.
> > > >>
> > > >> Best,
> > > >> Kurt
> > > >>
> > > >>
> > > >> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > > >> rmetz...@apache.org>
> > > >> wrote:
> > > >>
> > > >>> Hi all,
> 

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Zhijiang
Thanks for being the release manager and for the efficient work, Dian!

Best,
Zhijiang


--
From:Konstantin Knauf 
Send Time: Wed, Jul 22, 2020 19:55
To:Till Rohrmann 
Cc:dev ; Yangze Guo ; Dian Fu 
; user ; user-zh 

Subject:Re: [ANNOUNCE] Apache Flink 1.11.1 released

Thank you for managing the quick follow up release. I think this was very 
important for Table & SQL users.
On Wed, Jul 22, 2020 at 1:45 PM Till Rohrmann  wrote:

Thanks for being the release manager for the 1.11.1 release, Dian. Thanks a lot 
to everyone who contributed to this release.

Cheers,
Till
On Wed, Jul 22, 2020 at 11:38 AM Hequn Cheng  wrote:
Thanks Dian for the great work and thanks to everyone who makes this
 release possible!

 Best, Hequn

 On Wed, Jul 22, 2020 at 4:40 PM Jark Wu  wrote:

 > Congratulations! Thanks Dian for the great work and to be the release
 > manager!
 >
 > Best,
 > Jark
 >
 > On Wed, 22 Jul 2020 at 15:45, Yangze Guo  wrote:
 >
 > > Congrats!
 > >
 > > Thanks Dian Fu for being release manager, and everyone involved!
 > >
 > > Best,
 > > Yangze Guo
 > >
 > > On Wed, Jul 22, 2020 at 3:14 PM Wei Zhong 
 > wrote:
 > > >
 > > > Congratulations! Thanks Dian for the great work!
 > > >
 > > > Best,
 > > > Wei
 > > >
> > > > On Jul 22, 2020, at 15:09, Leonard Xu  wrote:
 > > > >
 > > > > Congratulations!
 > > > >
 > > > > Thanks Dian Fu for the great work as release manager, and thanks
 > > everyone involved!
 > > > >
 > > > > Best
 > > > > Leonard Xu
 > > > >
> > > >> On Jul 22, 2020, at 14:52, Dian Fu  wrote:
 > > > >>
 > > > >> The Apache Flink community is very happy to announce the release of
 > > Apache Flink 1.11.1, which is the first bugfix release for the Apache
 > Flink
 > > 1.11 series.
 > > > >>
> > > >> Apache Flink® is an open-source stream processing framework for
 > > distributed, high-performing, always-available, and accurate data
 > streaming
 > > applications.
 > > > >>
 > > > >> The release is available for download at:
 > > > >> https://flink.apache.org/downloads.html
 > > > >>
 > > > >> Please check out the release blog post for an overview of the
 > > improvements for this bugfix release:
 > > > >> https://flink.apache.org/news/2020/07/21/release-1.11.1.html
 > > > >>
 > > > >> The full release notes are available in Jira:
 > > > >>
 > >
 > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348323
 > > > >>
 > > > >> We would like to thank all contributors of the Apache Flink
 > community
 > > who made this release possible!
 > > > >>
 > > > >> Regards,
 > > > >> Dian
 > > > >
 > > >
 > >
 >


-- 
Konstantin Knauf 
https://twitter.com/snntrable
https://github.com/knaufk 



Re: [VOTE] Release 1.11.1, release candidate #1

2020-07-20 Thread Zhijiang
+1 (binding)

- Checked checksums and GPG files matching: OK
- Verified that the source does not contain any binaries: OK
- Checked that all the POMs point to the same version: OK
- Build the source with Maven: OK
- Review the Web PR: OK
- Checked the artifacts in repo: OK
- Execute the WordCount example by starting the cluster: Success, Web UI OK, 
logs OK.


Best,
Zhijiang


--
From:Yu Li 
Send Time:2020年7月20日(星期一) 19:57
To:dev 
Subject:Re: [VOTE] Release 1.11.1, release candidate #1

+1 (binding)

- Checked diff to last RC: OK (
https://github.com/apache/flink/compare/release-1.11.0...release-1.11.1-rc1)
  - All dependency related changes are properly documented
- Checked release notes: OK
  - Minor: there're 2 open issues left, please remember to scope them out
if this vote passes (the website PR already excludes them)
- Checked sums and signatures: OK
- Source release
 - contains no binaries: OK
 - contains no 1.11-SNAPSHOT references: OK
 - build from source: OK (8u101)
 - mvn clean verify: OK (8u101)
- Binary release
 - no examples appear to be missing
 - started a cluster, WebUI reachable, several streaming and batch
examples ran successfully (11.0.4)
- Repository appears to contain all expected artifacts
- Website PR looks good

Best Regards,
Yu


On Mon, 20 Jul 2020 at 18:00, Jark Wu  wrote:

> Thanks Dian for kicking off the RC.
>
> +1 from my side:
>
> I heavily tested CDC use cases end-to-end and it works well.
>
> - checked/verified signatures and hashes
> - manually verified the diff pom and NOTICE files between 1.11.0 and 1.11.1
> to check dependencies, looks good
> - no missing artifacts in release staging area compared to the 1.11.0
> release
> - started cluster and ran some table examples, verified web ui and log
> output, nothing unexpected
> - started cluster to run e2e SQL queries with millions of records with
> Kafka, MySQL, Elasticsearch as sources/lookup/sinks. Works well and the
> results are as expected.
> - use SQL CLI to read from Kafka with debezium data, and MySQL binlog
> source, and write into MySQL and Elasticsearch. Nothing unexpected
> - review the release PR
>
> Best,
> Jark
>
> On Mon, 20 Jul 2020 at 13:55, Congxian Qiu  wrote:
>
> > Hi  Dian
> >
> > Thanks for the information.
> >
> > Best,
> > Congxian
> >
> >
> > Dian Fu  于2020年7月20日周一 上午11:44写道:
> >
> > > Hi Congxian,
> > >
> > > FLINK-18544 was to fix an issue introduced in 1.11.1 and so it should
> not
> > > appear in the release note according to the release guide[1].
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release#CreatingaFlinkRelease-ReviewReleaseNotesinJIRA
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release#CreatingaFlinkRelease-ReviewReleaseNotesinJIRA
> > > >
> > >
> > > Regards,
> > > Dian
> > >
> > > > 在 2020年7月20日,上午11:32,Dian Fu  写道:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - checked the checksum and signature
> > > > - installed PyFlink package on MacOS and run some tests
> > > >
> > > > Regards,
> > > > Dian
> > > >
> > > >> 在 2020年7月20日,上午11:11,Congxian Qiu  > > qcx978132...@gmail.com>> 写道:
> > > >>
> > > >> +1 (non-binding)
> > > >>
> > > >> I found that the fix version of FLINK-18544 is 1.11.1, and the
> release
> > > did not contain it. I think we should fix it in the release note.
> > > >>
> > > >> checked
> > > >> - build from source, ok
> > > >> - sha512 sum, ok
> > > >> - gpg key, ok
> > > >> - License seem ok(checked the change of all pom.xml between 1.11.0
> and
> > > 1.11.1
> > >
> >
> https://github.com/apache/flink/compare/release-1.11.0..release-1.11.1-rc1
> > > <
> > >
> >
> https://github.com/apache/flink/compare/release-1.11.0..release-1.11.1-rc1
> > > >
> > > >> - Run some demo locally
> > > >>
> > > >> Best,
> > > >> Congxian
> > > >>
> > > >>
> > > >> Rui Li mailto:lirui.fu...@gmail.com>>
> > > 于2020年7月18日周六 下午7:04写道:
> > > >> +1 (non-binding)
> > > >>
> > > >> - Built from source
> > > >> - Verified hive connector tests for all hive versions
> > >

Re: Kinesis Performance Issue (was [VOTE] Release 1.11.0, release candidate #4)

2020-07-16 Thread Zhijiang
Hi Thomas,

Thanks for your further profiling information, and glad to see we have already 
pinpointed the location causing the regression. 
Actually, I was also suspicious of #snapshotState in previous discussions, 
since it indeed costs much time and blocks normal operator processing.

Based on your feedback below, the sleep time during #snapshotState might be the 
main concern, so I also dug into the implementation of 
FlinkKinesisProducer#snapshotState:
while (producer.getOutstandingRecordsCount() > 0) {
    producer.flush();
    try {
        Thread.sleep(500);
    } catch (InterruptedException e) {
        LOG.warn("Flushing was interrupted.");
        break;
    }
}
It seems that the sleep time is mainly affected by the internal operations 
inside the KinesisProducer implementation provided by amazonaws, which I am not 
quite familiar with. 
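The flush-and-sleep loop above can be exercised in isolation with a mock producer to see how quickly the 500 ms back-off dominates the sync snapshot time. This is only a sketch: `MockProducer` below is a hypothetical stand-in for the amazonaws `KinesisProducer`, and the drain rate of 400 records per flush is an arbitrary assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the amazonaws KinesisProducer (not the real class). */
class MockProducer {
    private final AtomicInteger outstanding;

    MockProducer(int records) {
        outstanding = new AtomicInteger(records);
    }

    int getOutstandingRecordsCount() {
        return outstanding.get();
    }

    /** Assume each flush round drains up to 400 records. */
    void flush() {
        outstanding.updateAndGet(n -> Math.max(0, n - 400));
    }
}

public class SnapshotFlushSketch {

    /** Same loop shape as FlinkKinesisProducer#snapshotState; returns the sleep count. */
    static int drain(MockProducer producer) throws InterruptedException {
        int sleeps = 0;
        while (producer.getOutstandingRecordsCount() > 0) {
            producer.flush();
            Thread.sleep(500); // fixed 500 ms back-off, as in the quoted loop
            sleeps++;
        }
        return sleeps;
    }

    public static void main(String[] args) throws InterruptedException {
        int sleeps = drain(new MockProducer(1000));
        System.out.println("sleeps=" + sleeps); // prints "sleeps=3"
    }
}
```

With 1000 outstanding records and three flush rounds, the loop sleeps three times, i.e. at least 1.5 s of the sync snapshot duration is pure back-off, consistent with the suspicion that the sleep, not the flush itself, dominates.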
But I noticed there were two upgrades related to it in release-1.11.0. One is 
for upgrading amazon-kinesis-producer to 0.14.0 [1] and another is for 
upgrading aws-sdk-version to 1.11.754 [2].
You mentioned that you already reverted the SDK upgrade and saw no change. 
Did you also revert [1] to verify?
[1] https://issues.apache.org/jira/browse/FLINK-17496
[2] https://issues.apache.org/jira/browse/FLINK-14881

Best,
Zhijiang
--
From:Thomas Weise 
Send Time:2020年7月17日(星期五) 05:29
To:dev 
Cc:Zhijiang ; Stephan Ewen ; 
Arvid Heise ; Aljoscha Krettek 
Subject:Re: Kinesis Performance Issue (was [VOTE] Release 1.11.0, release 
candidate #4)

Sorry for the delay.

I confirmed that the regression is due to the sink (unsurprising, since
another job with the same consumer, but not the producer, runs as expected).

As promised I did CPU profiling on the problematic application, which gives
more insight into the regression [1]

The screenshots show that the average time for snapshotState increases from
~9s to ~28s. The data also shows the increase in sleep time during
snapshotState.

Does anyone, based on changes made in 1.11, have a theory why?

I had previously looked at the changes to the Kinesis connector and also
reverted the SDK upgrade, which did not change the situation.

It will likely be necessary to drill into the sink / checkpointing details
to understand the cause of the problem.

Let me know if anyone has specific questions that I can answer from the
profiling results.

Thomas

[1]
https://docs.google.com/presentation/d/159IVXQGXabjnYJk3oVm3UP2UW_5G-TGs_u9yzYb030I/edit?usp=sharing

On Mon, Jul 13, 2020 at 11:14 AM Thomas Weise  wrote:

> + dev@ for visibility
>
> I will investigate further today.
>
>
> On Wed, Jul 8, 2020 at 4:42 AM Aljoscha Krettek 
> wrote:
>
>> On 06.07.20 20:39, Stephan Ewen wrote:
>> >- Did sink checkpoint notifications change in a relevant way, for
>> example
>> > due to some Kafka issues we addressed in 1.11 (@Aljoscha maybe?)
>>
>> I think that's unrelated: the Kafka fixes were isolated in Kafka and the
>> one bug I discovered on the way was about the Task reaper.
>>
>>
>> On 07.07.20 17:51, Zhijiang wrote:
>> > Sorry for my misunderstood of the previous information, Thomas. I was
>> assuming that the sync checkpoint duration increased after upgrade as it
>> was mentioned before.
>> >
>> > If I remembered correctly, the memory state backend also has the same
>> issue? If so, we can dismiss the rocksDB state changes. As the slot sharing
>> enabled, the downstream and upstream should
>> > probably deployed into the same slot, then no network shuffle effect.
>> >
>> > I think we need to find out whether it has other symptoms changed
>> besides the performance regression to further figure out the scope.
>> > E.g. any metrics changes, the number of TaskManager and the number of
>> slots per TaskManager from deployment changes.
>> > 40% regression is really big, I guess the changes should also be
>> reflected in other places.
>> >
>> > I am not sure whether we can reproduce the regression in our AWS
>> environment by writing any Kinesis jobs, since there are also normal
>> Kinesis jobs as Thomas mentioned after upgrade.
>> > So it probably looks like to touch some corner case. I am very willing
>> to provide any help for debugging if possible.
>> >
>> >
>> > Best,
>> > Zhijiang
>> >
>> >
>> > --
>> > From:Thomas Weise 
>> > Send Time:2020年7月7日(星期二) 23:01
>> > To:Stephan Ewen 
>> > Cc:Aljoscha Krettek ; Arvid Heise <
>> ar...@ververica.com>; Zhijiang 
>> > Subject:Re: Kinesis Performance Issue (was [VOTE] Release 1.11.0,
>> release

[jira] [Created] (FLINK-18612) WordCount example failure when using relative output path

2020-07-16 Thread Zhijiang (Jira)
Zhijiang created FLINK-18612:


 Summary: WordCount example failure when using relative output path
 Key: FLINK-18612
 URL: https://issues.apache.org/jira/browse/FLINK-18612
 Project: Flink
  Issue Type: Bug
  Components: fs
Affects Versions: 1.11.0, 1.11.1
Reporter: Zhijiang
 Fix For: 1.12.0, 1.11.2


The failure log can be found here 
[log|https://pipelines.actions.githubusercontent.com/revSbsLpzrFApLL6BmCvScWt72tRe3wYUv7fCdCtThtI5bydk7/_apis/pipelines/1/runs/27244/signedlogcontent/21?urlExpires=2020-07-16T06%3A35%3A49.4559813Z=HMACV1=%2FfAsJgIlIf%2BDitViRJYh0DAGJZjJwhsCGS219ZyniAA%3D].

By executing the following commands, we can reproduce this problem locally.
* bin/start-cluster.sh
* bin/flink run -p 1 examples/streaming/WordCount.jar --input input --output 
result

It is caused by the 
[commit|https://github.com/apache/flink/commit/a2deff2967b7de423b10f7f01a41c06565c37e62#diff-2010e422f5e43a971cd7134a9e0b9a5f
 ].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18591) Fix the format issue for metrics web page

2020-07-13 Thread Zhijiang (Jira)
Zhijiang created FLINK-18591:


 Summary: Fix the format issue for metrics web page
 Key: FLINK-18591
 URL: https://issues.apache.org/jira/browse/FLINK-18591
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Runtime / Metrics
Affects Versions: 1.11.0
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.12.0, 1.11.0


The formatting issue is shown by link 
https://ci.apache.org/projects/flink/flink-docs-release-1.11/monitoring/metrics.html#checkpointing





[jira] [Created] (FLINK-18552) Update migration tests in master to cover migration from release-1.11

2020-07-10 Thread Zhijiang (Jira)
Zhijiang created FLINK-18552:


 Summary: Update migration tests in master to cover migration from 
release-1.11
 Key: FLINK-18552
 URL: https://issues.apache.org/jira/browse/FLINK-18552
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Zhijiang
 Fix For: 1.12.0


We should update the following tests to cover migration from release-1.11:
 * {{CEPMigrationTest}}
 * {{BucketingSinkMigrationTest}}
 * {{FlinkKafkaConsumerBaseMigrationTest}}
 * {{ContinuousFileProcessingMigrationTest}}
 * {{WindowOperatorMigrationTest}}
 * {{StatefulJobSavepointMigrationITCase}}
 * {{StatefulJobWBroadcastStateMigrationITCase}}





Re: [DISCUSS] Releasing Flink 1.11.1 soon?

2020-07-09 Thread Zhijiang
Thanks for kicking off the discussion, Jark, and thanks to Dian for volunteering 
as release manager.

I am +1 for the proposal of releasing Flink 1.11.1 soon, since some critical bug 
fixes have accumulated.

Regarding the performance issue that Thomas mentioned before, we don't have any 
conclusions at the moment. I guess Thomas is still trying to find some 
potential clues.

Best,
Zhijiang




--
From:Congxian Qiu 
Send Time:2020年7月10日(星期五) 10:30
To:dev@flink.apache.org 
Subject:Re: [DISCUSS] Releasing Flink 1.11.1 soon?

+1 for a quick bug fix release for 1.11

Best,
Congxian


Yu Li  于2020年7月10日周五 上午9:37写道:

> +1, thanks Jark for bringing this up and Dian for volunteering as our
> release manager.
>
> Best Regards,
> Yu
>
>
> On Fri, 10 Jul 2020 at 09:29, Hequn Cheng  wrote:
>
> > +1 for a quick bug fix release and Dian as the release manager.
> >
> > Best,
> > Hequn
> >
> >
> > On Thu, Jul 9, 2020 at 9:22 PM Dian Fu  wrote:
> >
> > > Hi Jark,
> > >
> > > Thanks for offering the help. It would definitely be helpful.
> > >
> > > Regards,
> > > Dian
> > >
> > > > 在 2020年7月9日,下午8:54,Benchao Li  写道:
> > > >
> > > > +1 for a quick bug fix release for 1.11
> > > >
> > > > Aljoscha Krettek  于2020年7月9日周四 下午8:11写道:
> > > >
> > > >> +1
> > > >>
> > > >> I'd also be in favour of releasing a 1.11.1 quickly
> > > >>
> > > >> Aljoscha
> > > >>
> > > >> On 09.07.20 13:57, Jark Wu wrote:
> > > >>> Hi Dian,
> > > >>>
> > > >>> Glad to hear that you want to be the release manager of Flink
> 1.11.1.
> > > >>> I am very willing to help you with the final steps of the release
> > > >> process.
> > > >>>
> > > >>> Best,
> > > >>> Jark
> > > >>>
> > > >>> On Thu, 9 Jul 2020 at 17:57, Jingsong Li 
> > > wrote:
> > > >>>
> > > >>>> FLINK-18461 is really a blocker for the CDC feature.
> > > >>>>
> > > >>>> So +1 for releasing Flink 1.11.1 soon.
> > > >>>>
> > > >>>> Best,
> > > >>>> Jingsong
> > > >>>>
> > > >>>> On Thu, Jul 9, 2020 at 5:34 PM jincheng sun <
> > sunjincheng...@gmail.com
> > > >
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Thanks for bring up this discussion Jark.
> > > >>>>> +1, looking forward the first bugfix version of Flink 1.11.
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Jincheng
> > > >>>>>
> > > >>>>> Dian Fu  于2020年7月9日周四 下午5:28写道:
> > > >>>>>
> > > >>>>>> Thanks Jark for bringing up this discussion. I also noticed that
> > > there
> > > >>>>> are
> > > >>>>>> already users trying out the CDC feature and so it makes sense
> to
> > > >> have a
> > > >>>>>> quick 1.11.1 release.
> > > >>>>>>
> > > >>>>>> I would volunteer as the release manager of 1.11.1 if we finally
> > > >> decide
> > > >>>>> to
> > > >>>>>> have a quick release. Also +1 to create the first RC on next
> > Monday.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> Dian
> > > >>>>>>
> > > >>>>>>> 在 2020年7月9日,下午3:55,Dawid Wysakowicz 
> 写道:
> > > >>>>>>>
> > > >>>>>>> I do agree it would be beneficial to have the 1.11.1 rather
> soon.
> > > >>>>>>>
> > > >>>>>>> Personally additionally to Jark's list I'd like to see:
> > > >>>>>>>
> > > >>>>>>> FLINK-18419  Can not create a catalog from user jar
> > > >>>>>>> (https://issues.apache.org/jira/browse/FLINK-18419)
> > > >>>>>>>
> > > >>>>>>> incluedd. It has a PR already.
> > > >>>>>>>
> > > >>>>>>> Be

[ANNOUNCE] Apache Flink 1.11.0 released

2020-07-07 Thread Zhijiang
The Apache Flink community is very happy to announce the release of Apache 
Flink 1.11.0, which is the latest major release.

Apache Flink® is an open-source stream processing framework for distributed, 
high-performing, always-available, and accurate data streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements for 
this new major release:
https://flink.apache.org/news/2020/07/06/release-1.11.0.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346364

We would like to thank all contributors of the Apache Flink community who made 
this release possible!

Cheers,
Piotr & Zhijiang

Re: [ANNOUNCE] New PMC member: Piotr Nowojski

2020-07-06 Thread Zhijiang
Congratulations Piotr!

Best,
Zhijiang


--
From:Rui Li 
Send Time:2020年7月7日(星期二) 11:55
To:dev 
Cc:pnowojski 
Subject:Re: [ANNOUNCE] New PMC member: Piotr Nowojski

Congrats!

On Tue, Jul 7, 2020 at 11:25 AM Yangze Guo  wrote:

> Congratulations!
>
> Best,
> Yangze Guo
>
> On Tue, Jul 7, 2020 at 11:01 AM Jiayi Liao 
> wrote:
> >
> > Congratulations Piotr!
> >
> > Best,
> > Jiayi Liao
> >
> > On Tue, Jul 7, 2020 at 10:54 AM Jark Wu  wrote:
> >
> > > Congratulations Piotr!
> > >
> > > Best,
> > > Jark
> > >
> > > On Tue, 7 Jul 2020 at 10:50, Yuan Mei  wrote:
> > >
> > > > Congratulations, Piotr!
> > > >
> > > > On Tue, Jul 7, 2020 at 1:07 AM Stephan Ewen 
> wrote:
> > > >
> > > > > Hi all!
> > > > >
> > > > > It is my pleasure to announce that Piotr Nowojski joined the Flink
> PMC.
> > > > >
> > > > > Many of you may know Piotr from the work he does on the data
> processing
> > > > > runtime and the network stack, from the mailing list, or the
> release
> > > > > manager work.
> > > > >
> > > > > Congrats, Piotr!
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > >
> > >
>


-- 
Best regards!
Rui Li



[RESULT] [VOTE] Release 1.11.0, release candidate #4

2020-07-06 Thread Zhijiang
I'm happy to announce that we have unanimously approved the 1.11.0 release.

There are 17 approving votes, 8 of which are binding:

- Stephan (binding)
- Till (binding)
- Aljoscha (binding)
- Robert (binding)
- Chesnay (binding)
- Dawid (binding)
- Jark (binding)
- Jincheng (binding)
- Leonard Xu (non-binding)
- Dian Fu (non-binding)
- Xingbo Huang (non-binding)
- Steven Wu (non-binding)
- Congxian Qiu (non-binding)
- Benchao Li (non-binding)
- Xintong Song (non-binding)
- Jingsong (non-binding)
- Yang Wang (non-binding)

There are no disapproving votes.

Thanks everyone for the hard work and help making this release possible!


Best,
Zhijiang

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-06 Thread Zhijiang
Hi all,

The vote has now lasted for more than 72 hours. Thanks everyone for helping to 
test and verify the release. 
I will finalize the vote result soon in a separate email.

Best,
Zhijiang


--
From:Jingsong Li 
Send Time:2020年7月6日(星期一) 12:11
To:dev 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

+1 (non-binding)

- verified signature and checksum
- build from source
- checked webui and log sanity
- played with filesystem and new connectors
- played with Hive connector

Best,
Jingsong

On Mon, Jul 6, 2020 at 9:50 AM Xintong Song  wrote:

> +1 (non-binding)
>
> - verified signature and checksum
> - build from source
> - checked log sanity
> - checked webui
> - played with memory configurations
> - played with binding addresses/ports
>
> Thank you~
>
> Xintong Song
>
>
>
> On Sun, Jul 5, 2020 at 9:41 PM Benchao Li  wrote:
>
> > +1 (non-binding)
> >
> > Checks:
> > - verified signature and shasum of release files [OK]
> > - build from source [OK]
> > - started standalone cluster, sql-client [mostly OK except one issue]
> >   - played with sql-client
> >   - played with new features: LIKE / Table Options
> >   - checked Web UI functionality
> >   - canceled job from UI
> >
> > While I'm playing with the new table factories, I found one issue[1]
> which
> > surprises me.
> > I don't think this should be a blocker, hence I'll still vote my +1.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18487
> >
> > Zhijiang  于2020年7月5日周日 下午1:10写道:
> >
> > > Hi Thomas,
> > >
> > > Regarding [2], it has more detail infos in the Jira description (
> > > https://issues.apache.org/jira/browse/FLINK-16404).
> > >
> > > I can also give some basic explanations here to dismiss the concern.
> > > 1. In the past, the following buffers after the barrier will be cached
> on
> > > downstream side before alignment.
> > > 2. In 1.11, the upstream would not send the buffers after the barrier.
> > > When the downstream finishes the alignment, it will notify the
> downstream
> > > of continuing sending following buffers, since it can process them
> after
> > > alignment.
> > > 3. The only difference is that the temporary blocked buffers are cached
> > > either on downstream side or on upstream side before alignment.
> > > 4. The side effect would be the additional notification cost for every
> > > barrier alignment. If the downstream and upstream are deployed in
> > separate
> > > TaskManager, the cost is network transport delay (the effect can be
> > ignored
> > > based on our testing with 1s checkpoint interval). For sharing slot in
> > your
> > > case, the cost is only one method call in processor, can be ignored
> also.
> > >
> > > You mentioned "In this case, the downstream task has a high average
> > > checkpoint duration(~30s, sync part)." This duration is not reflecting
> > the
> > > changes above, and it is only indicating the duration for calling
> > > `Operation.snapshotState`.
> > > If this duration is beyond your expectation, you can check or debug
> > > whether the source/sink operations might take more time to finish
> > > `snapshotState` in practice. E.g. you can
> > > make the implementation of this method as empty to further verify the
> > > effect.
> > >
> > > Best,
> > > Zhijiang
> > >
> > >
> > > --
> > > From:Thomas Weise 
> > > Send Time:2020年7月5日(星期日) 12:22
> > > To:dev ; Zhijiang 
> > > Cc:Yingjie Cao 
> > > Subject:Re: [VOTE] Release 1.11.0, release candidate #4
> > >
> > > Hi Zhijiang,
> > >
> > > Could you please point me to more details regarding: "[2]: Delay send
> the
> > > following buffers after checkpoint barrier on upstream side until
> barrier
> > > alignment on downstream side."
> > >
> > > In this case, the downstream task has a high average checkpoint
> duration
> > > (~30s, sync part). If there was a change to hold buffers depending on
> > > downstream performance, could this possibly apply to this case (even
> when
> > > there is no shuffle that would require alignment)?
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > > On Sat, Jul 4, 2020 at 7:39 AM Zhijiang  > > .invalid>
> > > wrote:
> > 

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-04 Thread Zhijiang
Hi Thomas,

Regarding [2], it has more detail infos in the Jira description 
(https://issues.apache.org/jira/browse/FLINK-16404). 

I can also give some basic explanations here to dismiss the concern.
1. In the past, the buffers following the barrier were cached on the 
downstream side before alignment.
2. In 1.11, the upstream does not send the buffers after the barrier. When the 
downstream finishes the alignment, it notifies the upstream to continue sending 
the following buffers, since it can process them after alignment.
3. The only difference is that the temporarily blocked buffers are cached either 
on the downstream side or on the upstream side before alignment.
4. The side effect is the additional notification cost for every barrier 
alignment. If the downstream and upstream are deployed in separate TaskManagers, 
the cost is network transport delay (the effect can be ignored based on our 
testing with a 1s checkpoint interval). For slot sharing in your case, the cost 
is only one method call in the processor, which can also be ignored.
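The handover described in the numbered points above can be sketched as a toy, single-threaded simulation. All class and method names below are illustrative placeholders, not Flink's actual network-stack classes; the point is only to show where post-barrier buffers wait and where the extra notification happens.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of the 1.11 barrier handover; names are illustrative, not Flink's. */
public class BarrierHoldbackSketch {
    static final String BARRIER = "BARRIER";

    /** Downstream task: records the buffers it processes. */
    static class Downstream {
        final List<String> processed = new ArrayList<>();

        void receive(String buffer) {
            // A real task would start barrier alignment on BARRIER; here we just skip it.
            if (!BARRIER.equals(buffer)) {
                processed.add(buffer);
            }
        }
    }

    /** Upstream task: after sending a barrier, holds buffers until notified. */
    static class Upstream {
        final Deque<String> held = new ArrayDeque<>();
        final Downstream down;
        boolean pastBarrier = false;

        Upstream(Downstream down) {
            this.down = down;
        }

        void emit(String buffer) {
            if (pastBarrier) {
                held.add(buffer); // 1.11 behavior: cached on the upstream side
                return;
            }
            down.receive(buffer);
            if (BARRIER.equals(buffer)) {
                pastBarrier = true; // stop sending after the barrier
            }
        }

        /** The extra notification hop: downstream finished alignment, resume sending. */
        void onAlignmentDone() {
            pastBarrier = false;
            while (!held.isEmpty()) {
                down.receive(held.poll());
            }
        }
    }

    public static void main(String[] args) {
        Downstream d = new Downstream();
        Upstream u = new Upstream(d);
        u.emit("a");
        u.emit(BARRIER);
        u.emit("b"); // held upstream
        u.emit("c"); // held upstream
        u.onAlignmentDone();
        System.out.println(d.processed); // prints [a, b, c]
    }
}
```

Before `onAlignmentDone()` is called, the post-barrier buffers sit in the upstream's `held` queue (the 1.11 behavior); pre-1.11, they would instead be cached on the downstream side.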

You mentioned "In this case, the downstream task has a high average checkpoint 
duration (~30s, sync part)." This duration does not reflect the changes above; 
it only indicates the time spent calling `Operator.snapshotState`. 
If this duration is beyond your expectation, you can check or debug whether the 
source/sink operators might take more time to finish `snapshotState` in 
practice. E.g. you can make the implementation of this method empty to further 
verify the effect.

Best,
Zhijiang


--
From:Thomas Weise 
Send Time:2020年7月5日(星期日) 12:22
To:dev ; Zhijiang 
Cc:Yingjie Cao 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

Hi Zhijiang,

Could you please point me to more details regarding: "[2]: Delay send the
following buffers after checkpoint barrier on upstream side until barrier
alignment on downstream side."

In this case, the downstream task has a high average checkpoint duration
(~30s, sync part). If there was a change to hold buffers depending on
downstream performance, could this possibly apply to this case (even when
there is no shuffle that would require alignment)?

Thanks,
Thomas


On Sat, Jul 4, 2020 at 7:39 AM Zhijiang 
wrote:

> Hi Thomas,
>
> Thanks for the further update information.
>
> I guess we can dismiss the network stack changes, since in your case the
> downstream and upstream would probably be deployed in the same slot
> bypassing the network data shuffle.
> Also I guess release-1.11 will not bring general performance regression in
> runtime engine, as we also did the performance testing for all general
> cases by [1] in real cluster before and the testing results should fit the
> expectation. But we indeed did not test the specific source and sink
> connectors yet as I known.
>
> Regarding your performance regression with 40%, I wonder it is probably
> related to specific source/sink changes (e.g. kinesis) or environment
> issues with corner case.
> If possible, it would be helpful to further locate whether the regression
> is caused by kinesis, by replacing the kinesis source & sink and keeping
> the others same.
>
> As you said, it would be efficient to contact with you directly next week
> to further discuss this issue. And we are willing/eager to provide any help
> to resolve this issue soon.
>
> Besides that, I guess this issue should not be the blocker for the
> release, since it is probably a corner case based on the current analysis.
> If we really conclude anything need to be resolved after the final
> release, then we can also make the next minor release-1.11.1 come soon.
>
> [1] https://issues.apache.org/jira/browse/FLINK-18433
>
> Best,
> Zhijiang
>
>
> --
> From:Thomas Weise 
> Send Time:2020年7月4日(星期六) 12:26
> To:dev ; Zhijiang 
> Cc:Yingjie Cao 
> Subject:Re: [VOTE] Release 1.11.0, release candidate #4
>
> Hi Zhijiang,
>
> It will probably be best if we connect next week and discuss the issue
> directly since this could be quite difficult to reproduce.
>
> Before the testing result on our side comes out for your respective job
> case, I have some other questions to confirm for further analysis:
> -  How much percentage regression you found after switching to 1.11?
>
> ~40% throughput decline
>
> -  Are there any network bottleneck in your cluster? E.g. the network
> bandwidth is full caused by other jobs? If so, it might have more effects
> by above [2]
>
> The test runs on a k8s cluster that is also used for other production jobs.
> There is no reason to believe the network is the bottleneck.
>
> -  Did you adjust the default network buffer setting? E.g.
> &qu

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-04 Thread Zhijiang
Hi Thomas,

Thanks for the further update information. 

I guess we can dismiss the network stack changes, since in your case the 
downstream and upstream would probably be deployed in the same slot bypassing 
the network data shuffle. 
Also, I guess release-1.11 does not bring a general performance regression in the 
runtime engine, as we did performance testing for all general cases via [1] in a 
real cluster before, and the testing results fit the expectation. 
But we indeed did not test the specific source and sink connectors yet, as far 
as I know.

Regarding your ~40% performance regression, I suspect it is related to specific 
source/sink changes (e.g. Kinesis) or an environment issue in a corner case. 
If possible, it would be helpful to further locate whether the regression is 
caused by Kinesis, by replacing the Kinesis source & sink and keeping the 
others the same.

As you said, it would be most efficient to connect with you directly next week 
to further discuss this issue. We are willing and eager to provide any help to 
resolve it soon.

Besides that, I guess this issue should not be a blocker for the release, since 
it is probably a corner case based on the current analysis. 
If we really conclude anything needs to be resolved after the final release, 
then we can also make the next minor release, 1.11.1, come soon.

[1] https://issues.apache.org/jira/browse/FLINK-18433

Best,
Zhijiang


--
From:Thomas Weise 
Send Time:2020年7月4日(星期六) 12:26
To:dev ; Zhijiang 
Cc:Yingjie Cao 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

Hi Zhijiang,

It will probably be best if we connect next week and discuss the issue
directly since this could be quite difficult to reproduce.

Before the testing result on our side comes out for your respective job
case, I have some other questions to confirm for further analysis:
-  How much percentage regression you found after switching to 1.11?

~40% throughput decline

-  Are there any network bottleneck in your cluster? E.g. the network
bandwidth is full caused by other jobs? If so, it might have more effects
by above [2]

The test runs on a k8s cluster that is also used for other production jobs.
There is no reason to believe the network is the bottleneck.

-  Did you adjust the default network buffer setting? E.g.
"taskmanager.network.memory.floating-buffers-per-gate" or
"taskmanager.network.memory.buffers-per-channel"

The job is using the defaults, i.e we don't configure the settings. If you
want me to try specific settings in the hope that it will help to isolate
the issue please let me know.
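For reference, the two settings in question would look like this in flink-conf.yaml; the values shown are the documented defaults (stated here as an assumption, since the job does not override them):

```yaml
# Exclusive network buffers per outgoing/incoming channel (default)
taskmanager.network.memory.buffers-per-channel: 2
# Extra floating buffers shared per input gate (default)
taskmanager.network.memory.floating-buffers-per-gate: 8
```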

-  I guess the topology has three vertices "KinesisConsumer -> Chained
FlatMap -> KinesisProducer", and the partition modes for "KinesisConsumer ->
FlatMap" and "FlatMap->KinesisProducer" are both "forward"? If so, the edge
connection is one-to-one, not all-to-all, so the above [1][2] should have no
effect in theory with the default network buffer setting.

There are only 2 vertices and the edge is "forward".

- By slot sharing, I guess these vertices' parallel tasks would
probably be deployed into the same slot, so the data shuffle goes through an
in-memory queue, not the network stack. If so, the above [2] should have no effect.

Yes, vertices share slots.

- I also saw some Jira changes for kinesis in this release, could you
confirm that these changes would not affect the performance?

I will need to take a look. 1.10 already had a regression introduced by the
Kinesis producer update.


Thanks,
Thomas


On Thu, Jul 2, 2020 at 11:46 PM Zhijiang 
wrote:

> Hi Thomas,
>
> Thanks for your reply with rich information!
>
> We are trying to reproduce your case in our cluster to further verify it,
> and  @Yingjie Cao is working on it now.
>  As we have not kinesis consumer and producer internally, so we will
> construct the common source and sink instead in the case of backpressure.
>
> Firstly, we can dismiss the rockdb factor in this release, since you also
> mentioned that "filesystem leads to same symptoms".
>
> Secondly, if my understanding is right, you emphasis that the regression
> only exists for the jobs with low checkpoint interval (10s).
> Based on that, I have two suspicions with the network related changes in
> this release:
> - [1]: Limited the maximum backlog value (default 10) in subpartition
> queue.
> - [2]: Delay send the following buffers after checkpoint barrier on
> upstream side until barrier alignment on downstream side.
>
> These changes are motivated for reducing the in-flight buffers to speedup
> checkpoint especially in the case of backpressure.
> In theory they should have very minor performance effect and actually we
> also tested in cluster to verify within expectation before merging the

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-03 Thread Zhijiang
Hi Thomas,

Thanks for your reply with rich information!

We are trying to reproduce your case in our cluster to further verify it, and 
@Yingjie Cao is working on it now. 
As we do not have a Kinesis consumer and producer internally, we will construct 
a common source and sink instead in the case of backpressure. 

Firstly, we can dismiss the RocksDB factor in this release, since you also 
mentioned that "filesystem leads to same symptoms".

Secondly, if my understanding is right, you emphasized that the regression only 
exists for jobs with a low checkpoint interval (10s). 
Based on that, I have two suspicions about the network-related changes in this 
release:
- [1]: Limited the maximum backlog value (default 10) in subpartition 
queue. 
- [2]: Delay send the following buffers after checkpoint barrier on 
upstream side until barrier alignment on downstream side.

These changes are motivated by reducing the number of in-flight buffers to speed up 
checkpointing, especially in the case of backpressure.
In theory they should have a very minor performance effect, and we also tested 
them in a cluster before merging to verify they behaved within expectations, 
but maybe there are other corner cases we have not thought of before.
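In theory the effect of [2] is tiny per checkpoint, but a short checkpoint interval makes any per-checkpoint sender stall repeat far more often. As a rough back-of-the-envelope illustration (a toy model in plain Python with a made-up stall duration, not Flink code or measured numbers):

```python
def effective_send_fraction(checkpoint_interval_s: float, stall_s: float) -> float:
    """Fraction of wall-clock time the upstream can emit data, assuming every
    checkpoint stalls the sender for `stall_s` seconds (hypothetical number)."""
    return checkpoint_interval_s / (checkpoint_interval_s + stall_s)

# Assume a hypothetical 0.5 s stall per checkpoint:
print(round(effective_send_fraction(10.0, 0.5), 3))   # 10 s interval   -> 0.952
print(round(effective_send_fraction(600.0, 0.5), 3))  # 10 min interval -> 0.999
```

Under this toy model, the same stall costs roughly 5% of send time at a 10s interval but is negligible at a 10-minute interval, which would match a regression that only shows up with low checkpoint intervals.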

Before the testing results on our side come out for your respective job case, I 
have some other questions to confirm for further analysis:
-  What percentage of regression did you observe after switching to 1.11?
-  Is there any network bottleneck in your cluster? E.g., is the network 
bandwidth saturated by other jobs? If so, it might be affected more by the 
above [2].
-  Did you adjust the default network buffer settings? E.g., 
"taskmanager.network.memory.floating-buffers-per-gate" or 
"taskmanager.network.memory.buffers-per-channel"?
-  I guess the topology has three vertices "KinesisConsumer -> Chained 
FlatMap -> KinesisProducer", and the partition mode for "KinesisConsumer -> 
FlatMap" and "FlatMap -> KinesisProducer" is "forward" in both cases? If so, the 
edge connection is one-to-one, not all-to-all, and then the above [1][2] should 
have no effect in theory with the default network buffer settings.
- With slot sharing, I guess the tasks of these three vertices would probably 
be deployed into the same slot, so the data shuffle goes through an in-memory 
queue, not the network stack. If so, the above [2] should have no effect.
- I also saw some Jira changes for Kinesis in this release; could you 
confirm that these changes would not affect the performance?

Best,
Zhijiang


--
From:Thomas Weise 
Send Time:2020年7月3日(星期五) 01:07
To:dev ; Zhijiang 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

Hi Zhijiang,

The performance degradation manifests in backpressure which leads to
growing backlog in the source. I switched a few times between 1.10 and 1.11
and the behavior is consistent.

The DAG is:

KinesisConsumer -> (Flat Map, Flat Map, Flat Map)    forward
-> KinesisProducer

Parallelism: 160
No shuffle/rebalance.

Checkpointing config:

Checkpointing Mode Exactly Once
Interval 10s
Timeout 10m 0s
Minimum Pause Between Checkpoints 10s
Maximum Concurrent Checkpoints 1
Persist Checkpoints Externally Enabled (delete on cancellation)

State backend: rocksdb  (filesystem leads to same symptoms)
Checkpoint size is tiny (500KB)

An interesting difference to another job that I had upgraded successfully
is the low checkpointing interval.

Thanks,
Thomas


On Wed, Jul 1, 2020 at 9:02 PM Zhijiang 
wrote:

> Hi Thomas,
>
> Thanks for the efficient feedback.
>
> Regarding the suggestion of adding the release notes document, I agree
> with your point. Maybe we should adjust the vote template accordingly in
> the respective wiki to guide the following release processes.
>
> Regarding the performance regression, could you provide some more details
> for our better measurement or reproducing on our sides?
> E.g. I guess the topology only includes two vertexes source and sink?
> What is the parallelism for every vertex?
> The upstream shuffles data to the downstream via rebalance partitioner or
> other?
> The checkpoint mode is exactly-once with rocksDB state backend?
> The backpressure happened in this case?
> How much percentage regression in this case?
>
> Best,
> Zhijiang
>
>
>
> --
> From:Thomas Weise 
> Send Time:2020年7月2日(星期四) 09:54
> To:dev 
> Subject:Re: [VOTE] Release 1.11.0, release candidate #4
>
> Hi Till,
>
> Yes, we don't have the setting in flink-conf.yaml.
>
> Generally, we carry forward the existing configuration and any change to
> default configuration values would impact the upgrade.
>
> Yes, since it is an incompatible change I would state it in the release
> notes

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-02 Thread Zhijiang
I also agree with Till and Robert's proposals. 

In general, I think we should not block the release based on the current estimation. 
Otherwise, if we continuously postpone the release, new blocker bugs will likely 
keep appearing, and we might get stuck in such a cycle, never delivering a final 
release to users in time. But that does not mean RC4 will be the final one; we can 
re-evaluate the effects in progress with the accumulated issues.

Regarding the performance regression, if possible we can reproduce it to analyze 
the cause based on Thomas's feedback, and then evaluate its effect.

Regarding FLINK-18461, after syncing with Jark offline: the bug affects one of 
three scenarios for using the CDC feature, and this affected scenario is actually 
the way most commonly used by users.
My suggestion is to merge it into release-1.11 ATM, since the PR is already open 
for review, and then finalize the conclusion later. If this issue is the only one 
after RC4 goes through, then another option is to cover it in the next release 
(1.11.1), as Robert suggested, since we can prepare the next minor release soon. 
If there are other blocker issues during voting that need to be resolved soon, 
then there is no doubt we should cover all of them in the next RC5.

Best,
Zhijiang


--
From:Till Rohrmann 
Send Time:2020年7月2日(星期四) 16:46
To:dev 
Cc:Zhijiang 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

I agree with Robert.

@Chesnay: The problem has probably already existed in Flink 1.10 and before 
because we cannot run jobs with eager execution calls from the web ui. I agree 
with Robert that we can/should improve our documentation in this regard, though.

@Thomas: 
1. I will update the release notes to add a short section describing that one 
needs to configure the JobManager memory. 
2. Concerning the performance regression we should look into it. I believe 
Zhijiang is very eager to learn more about your exact setup to further debug 
it. Again I agree with Robert to not block the release on it at the moment.

@Jark: How much of a problem is FLINK-18461? Will it make the CDC feature 
completely unusable, or will it only break a subset of the use cases? If it is the 
latter, then I believe that we can document the limitations and try to fix it 
asap. Depending on the remaining testing, the fix might make it into the 1.11.0 or 
the 1.11.1 release.

Cheers,
Till
On Thu, Jul 2, 2020 at 10:33 AM Robert Metzger  wrote:
Thanks a lot for the thorough testing Thomas! This is really helpful!

 @Chesnay: I would not block the release on this. The web submission does
 not seem to be the documented / preferred way of job submission. It is
 unlikely to harm the beginner's experience (and they would anyways not read
 the release notes). I mention the beginner experience, because they are the
 primary audience of the examples.

 Regarding FLINK-18461 / Jark's issue: I would not block the release on
 that, but still try to get it fixed asap. It is likely that this RC doesn't
 go through (given the rate at which we are finding issues), and even if it
 goes through, we can document it as a known issue in the release
 announcement and immediately release 1.11.1.
 Blocking the release on this causes quite a bit of work for the release
 managers for rolling a new RC. Until we have understood the performance
 regression Thomas is reporting, I would keep this RC open, and keep testing.


 On Thu, Jul 2, 2020 at 8:34 AM Jark Wu  wrote:

 > Hi,
 >
 > I'm very sorry but we just found a blocker issue FLINK-18461 [1] in the new
 > feature of changelog source (CDC).
 > This bug means that queries on a changelog source can't be inserted into
 > an upsert sink (e.g. ES, JDBC, HBase),
 > which is a common case in production. CDC is one of the important features
 > of Table/SQL in this release,
 > so from my side, I hope we can have this fix in 1.11.0, otherwise, this is
 > a broken feature...
 >
 > Again, I am terribly sorry for delaying the release...
 >
 > Best,
 > Jark
 >
 > [1]: https://issues.apache.org/jira/browse/FLINK-18461
 >
 > On Thu, 2 Jul 2020 at 12:02, Zhijiang 
 > wrote:
 >
 > > Hi Thomas,
 > >
 > > Thanks for the efficient feedback.
 > >
 > > Regarding the suggestion of adding the release notes document, I agree
 > > with your point. Maybe we should adjust the vote template accordingly in
 > > the respective wiki to guide the following release processes.
 > >
 > > Regarding the performance regression, could you provide some more details
 > > for our better measurement or reproducing on our sides?
 > > E.g. I guess the topology only includes two vertexes source and sink?
 > > What is the parallelism for every vertex?
 > > The upstream shuffles data to the downstream via rebalance partitioner or
 > > other?
 &

Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-01 Thread Zhijiang
Hi Thomas,

Thanks for the efficient feedback. 

Regarding the suggestion of adding the release notes document, I agree with 
your point. Maybe we should adjust the vote template accordingly in the 
respective wiki to guide the following release processes.

Regarding the performance regression, could you provide some more details for 
our better measurement or reproducing on our sides? 
E.g. I guess the topology only includes two vertexes source and sink? 
What is the parallelism for every vertex?
The upstream shuffles data to the downstream via rebalance partitioner or other?
The checkpoint mode is exactly-once with rocksDB state backend?
The backpressure happened in this case?
How much percentage regression in this case?

Best,
Zhijiang



--
From:Thomas Weise 
Send Time:2020年7月2日(星期四) 09:54
To:dev 
Subject:Re: [VOTE] Release 1.11.0, release candidate #4

Hi Till,

Yes, we don't have the setting in flink-conf.yaml.

Generally, we carry forward the existing configuration and any change to
default configuration values would impact the upgrade.

Yes, since it is an incompatible change I would state it in the release
notes.

Thanks,
Thomas

BTW I found a performance regression while trying to upgrade another
pipeline with this RC. It is a simple Kinesis to Kinesis job. Wasn't able
to pin it down yet, symptoms include increased checkpoint alignment time.

On Wed, Jul 1, 2020 at 12:04 AM Till Rohrmann  wrote:

> Hi Thomas,
>
> just to confirm: When starting the image in local mode, then you don't have
> any of the JobManager memory configuration settings configured in the
> effective flink-conf.yaml, right? Does this mean that you have explicitly
> removed `jobmanager.heap.size: 1024m` from the default configuration? If
> this is the case, then I believe it was more of an unintentional artifact
> that it worked before and it has been corrected now so that one needs to
> specify the memory of the JM process explicitly. Do you think it would help
> to explicitly state this in the release notes?
>
> Cheers,
> Till
>
> On Wed, Jul 1, 2020 at 7:01 AM Thomas Weise  wrote:
>
> > Thanks for preparing another RC!
> >
> > As mentioned in the previous RC thread, it would be super helpful if the
> > release notes that are part of the documentation can be included [1].
> It's
> > a significant time-saver to have read those first.
> >
> > I found one more non-backward compatible change that would be worth
> > addressing/mentioning:
> >
> > It is now necessary to configure the jobmanager heap size in
> > flink-conf.yaml (with either jobmanager.heap.size
> > or jobmanager.memory.heap.size). Why would I not want to do that anyways?
> > Well, we set it dynamically for a cluster deployment via the
> > flinkk8soperator, but the container image can also be used for testing
> with
> > local mode (./bin/jobmanager.sh start-foreground local). That will fail
> if
> > the heap wasn't configured and that's how I noticed it.
> >
> > Thanks,
> > Thomas
> >
> > [1]
> >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/release-notes/flink-1.11.html
> >
> > On Tue, Jun 30, 2020 at 3:18 AM Zhijiang  > .invalid>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #4 for the version
> > 1.11.0,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint 2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "release-1.11.0-rc4" [5],
> > > * website pull request listing the new release and adding announcement
> > > blog post [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Release Manager
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc4/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1377/
> > > [5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc4
> > > [6] https://github.com/apache/flink-web/pull/352
> > >
> > >
> >
>



[VOTE] Release 1.11.0, release candidate #4

2020-06-30 Thread Zhijiang
Hi everyone,

Please review and vote on the release candidate #4 for the version 1.11.0, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.11.0-rc4" [5],
* website pull request listing the new release and adding announcement blog 
post [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc4/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1377/
[5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc4
[6] https://github.com/apache/flink-web/pull/352



Re: [VOTE] Release 1.11.0, release candidate #3

2020-06-29 Thread Zhijiang
I would close this RC3 vote as Piotr mentioned below, and the next RC4 would be 
ready for formal vote tomorrow.

Best,
Zhijiang


--
From:Piotr Nowojski 
Send Time:2020年6月26日(星期五) 15:21
To:Marta Paes Moreira 
Cc:dev ; Zhijiang 
Subject:Re: [VOTE] Release 1.11.0, release candidate #3

Hi,

I would vote -1, mainly because of the performance regression that we are 
currently investigating [1], and a couple of bug fixes that didn't make it into RC3 
but are already fixed on the release-1.11 branch [2], [3] and [4].

[1] https://issues.apache.org/jira/browse/FLINK-18433
[2] https://issues.apache.org/jira/browse/FLINK-18426
[3] https://issues.apache.org/jira/browse/FLINK-18428
[4] https://issues.apache.org/jira/browse/FLINK-18429

Piotrek


On 25 Jun 2020, at 16:23, Marta Paes Moreira  wrote:
Thanks, Zhijiang!

The PR for the release announcement blogpost is now available in [1]. Any 
feedback or comments are appreciated!

[1] https://github.com/apache/flink-web/pull/352
On Wed, Jun 24, 2020 at 5:03 PM Zhijiang  
wrote:
Hi everyone,

 This RC3 vote is re-launched to fix the tag issue ([ANNOUNCE] -> [VOTE]) in the 
email title and to adjust the vote period (72 hours -> 48 hours), based on the 
latest considerations.

 Please review and vote on the release candidate #3 for the version 1.11.0, as 
follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific comments)

 The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.11.0-rc3" [5],
 * Marta is also preparing a pull request for the announcement blog post in the 
works, and will update this voting thread with a link to the pull request 
shortly afterwards.

 The vote will be open for at least 48 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

 Note that we will probably have a next RC4, based on the latest feedback, to 
cover mainly cosmetic/API changes, but most other testing efforts should carry 
over.


 Thanks,
 Release Manager

 [1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
 [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc3/
 [3] https://dist.apache.org/repos/dist/release/flink/KEYS
 [4] https://repository.apache.org/content/repositories/orgapacheflink-1376/
 [5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc3



[VOTE] Release 1.11.0, release candidate #3

2020-06-24 Thread Zhijiang
Hi everyone,

This RC3 vote is re-launched to fix the tag issue ([ANNOUNCE] -> [VOTE]) in the 
email title and to adjust the vote period (72 hours -> 48 hours), based on the 
latest considerations.

Please review and vote on the release candidate #3 for the version 1.11.0, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.11.0-rc3" [5],
* Marta is also preparing a pull request for the announcement blog post in the 
works, and will update this voting thread with a link to the pull request 
shortly afterwards.

The vote will be open for at least 48 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Note that we will probably have a next RC4, based on the latest feedback, to 
cover mainly cosmetic/API changes, but most other testing efforts should carry 
over.


Thanks,
Release Manager

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1376/
[5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc3

Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #3

2020-06-24 Thread Zhijiang
Thanks for the reminder, Benchao. That was careless of me; I forgot to change the 
tag when copying and pasting the previous announcement title.
I will re-launch a separate vote email for it.

Best,
Zhijiang


--
From:Benchao Li 
Send Time:2020年6月24日(星期三) 20:47
To:dev ; Zhijiang 
Cc:Piotr Nowojski 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #3

Hi Zhijiang & Piotr,

Thanks for preparing RC3 and bringing up the vote.

Does the vote thread title need to be tagged with [VOTE]?
I don't know if it's obligatory; just pointing it out.

Zhijiang  于2020年6月24日周三 下午7:49写道:

> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 1.11.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.11.0-rc3" [5],
> * Marta is also preparing a pull request for the announcement blog post in
> the works, and will update this voting thread with a link to the pull
> request shortly afterwards.
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> Thanks,
> Release Manager
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1376/
> [5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc3



-- 

Best,
Benchao Li



[ANNOUNCE] Apache Flink 1.11.0, release candidate #3

2020-06-24 Thread Zhijiang
Hi everyone,

Please review and vote on the release candidate #3 for the version 1.11.0, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.11.0-rc3" [5],
* Marta is also preparing a pull request for the announcement blog post in the 
works, and will update this voting thread with a link to the pull request 
shortly afterwards.

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.
Thanks,
Release Manager
[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346364
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1376/
[5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc3

Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

2020-06-23 Thread Zhijiang
Hi Thomas,

Thanks for this valuable feedback and these suggestions; they are very helpful 
for making us better.

I can give a direct answer to this issue:
> checkpoint alignment buffered metric missing - note that this job isn't using 
> the new unaligned checkpointing that should be opt-in.

The checkpoint alignment buffered metric is now always 0, whether unaligned 
checkpointing is used or not, so we removed this metric directly.
The motivation for this change is to reduce the number of in-flight buffers and 
thereby speed up checkpointing. The upstream side blocks sending any further
buffers after sending the barrier until it receives the alignment notification 
from the downstream side. Therefore, the downstream side never needs to cache
buffers for blocked channels during alignment. We also documented this change 
in the release notes for attention; see link [1].

[1] 
https://github.com/apache/flink/pull/12699/files#diff-eaa874e007e88f283e96de2d61cc4140R174
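To make the reasoning above concrete, here is a schematic two-channel aligner in plain Python (hypothetical names, deliberately not Flink's actual classes): because the upstream stops sending data buffers after the barrier until alignment completes, the "buffered during alignment" count can only ever stay at 0.

```python
class ToyAligner:
    """Toy downstream barrier aligner, for illustration only."""

    def __init__(self, num_channels: int):
        self.pending = set(range(num_channels))  # channels we still await a barrier from
        self.buffered_during_alignment = 0

    def on_barrier(self, channel: int) -> bool:
        """Record a barrier; return True once all channels are aligned."""
        self.pending.discard(channel)
        return not self.pending

    def on_data(self, channel: int) -> None:
        if channel not in self.pending and self.pending:
            # This channel already delivered its barrier while alignment is still
            # ongoing: before 1.11 such a buffer would have been cached here. With
            # the new behavior the upstream never sends it, so in a run modeling
            # 1.11 this branch is never taken.
            self.buffered_during_alignment += 1

aligner = ToyAligner(2)
aligner.on_barrier(0)
# Channel 0's upstream is now blocked, so no on_data(0) call can arrive here.
aligner.on_barrier(1)
print(aligner.buffered_during_alignment)  # -> 0, hence the metric was removed
```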

Best,
Zhijiang
--
From:Thomas Weise 
Send Time:2020年6月24日(星期三) 06:51
To:dev 
Cc:zhijiang 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

Hi,

Thanks for putting together the RC!

I have some preliminary feedback from testing with commit
934f91ead00fd658333f65ffa37ab60bd5ffd99b

An internal benchmark application that reads from Kinesis and checkpoints
~12GB performs comparably to 1.10.1

There were a few issues hit upgrading our codebase that may be worthwhile
considering, please see details below.

Given my observations over the past few releases, I would like to suggest
that the community introduces a log of incompatible changes to be published
with the release notes. Though it is possible to analyze git history when
hitting compile errors, there are more subtle changes that can make
upgrades unnecessarily time-consuming. Contributors introducing such
changes are probably in the best position to document them.

I'm planning to try this or the next RC with a couple more applications.

Cheers,
Thomas

* notifyCheckpointAborted needed to be implemented
for org.apache.flink.runtime.state.CheckpointListener - can we have a
default implementation in the interface so that users aren't forced to
change their implementations?

* following deprecated configuration values had to be modified to get
the job running:

  "taskmanager.initial-registration-pause": "500ms",
  "taskmanager.max-registration-pause": "5s",
  "taskmanager.refused-registration-pause": "5s",

The error message was:

Could not parse value '500ms' for key
'cluster.registration.initial-timeout'.\n\tat
org.apache.flink.configuration.Configuration.getOptional(Configuration.java:753)\n\tat
org.apache.flink.configuration.Configuration.getLong(Configuration.java:298)\n\tat
org.apache.flink.runtime.registration.RetryingRegistrationConfiguration.fromConfiguration(RetryingRegistrationConfiguration.java:72)\n\tat
org.apache.flink.runtime.taskexecutor.TaskManagerServicesConfiguration.fromConfiguration(TaskManagerServicesConfiguration.java:262)\n\tat

Though easy to fix, it's unfortunate that values are now treated
differently.

* checkpoint alignment buffered metric missing - note that this job isn't
using the new unaligned checkpointing that should be opt-in.

* -import org.apache.flink.table.api.java.StreamTableEnvironment;
  +import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

 * -ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
config, program.build());
+ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
config, program.build(),
  false, false);

* ProcessingTimeCallback removed from StreamingFileSink


On Wed, Jun 17, 2020 at 6:29 AM Piotr Nowojski  wrote:

> Hi all,
>
> I would like to give an update about the RC2 status. We are now waiting for
> a green azure build on one final bug fix before creating RC2. This bug fix
> should be merged late afternoon/early evening Berlin time, so RC2 will be
> hopefully created tomorrow morning. Until then I would ask to not
> merge/backport commits to release-1.11 branch, including bug fixes. If you
> have something that's truly essential and should be treated as a release
> blocker, please reach out to me or Zhijiang.
>
> Best,
> Piotr Nowojski
>



Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

2020-06-23 Thread Zhijiang
Hi Fabian,

I do not think that issue would block the current testing purpose, since the code 
of RC2 does not contain that compile issue.
You can check out the RC2 tag [1] for compiling if needed. And we may prepare 
the next formal votable RC3 soon.


[1] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc2/

Best,
Zhijiang


--
From:Fabian Paul 
Send Time:2020年6月23日(星期二) 15:41
To:Zhijiang ; dev 
Cc:zhijiang ; pnowojski 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

Hi,

Thanks again for uploading the missing artifacts. Unfortunately, this RC does not 
fully compile due to [1].

Would it be possible, for testing purposes, to quickly include this fix into the 
RC, or do you think it is necessary to open a completely new one?


[1] https://issues.apache.org/jira/browse/FLINK-18411

Best,
Fabian



Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

2020-06-22 Thread Zhijiang
Hi all, 

The previous link [1] for all artifacts in the Maven Central Repository was missing 
many artifacts, so we have deployed them again under the new link [2].
Sorry for the inconvenience and happy testing again!

[1] https://repository.apache.org/content/repositories/orgapacheflink-1374
[2] https://repository.apache.org/content/repositories/orgapacheflink-1375/

Best,
Zhijiang


--
From:Zhijiang 
Send Time:2020年6月22日(星期一) 18:34
To:Fabian Paul ; dev 
Cc:zhijiang ; pnowojski 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

Hi Fabian,

Thanks for finding this and letting us know. I will double-check it and update the 
missing jars afterwards if needed.

Best,
Zhijiang


--
From:Fabian Paul 
Send Time:2020年6月22日(星期一) 16:55
To:dev 
Cc:zhijiang ; pnowojski 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

Hi,

Thanks for the great efforts in preparing the second rc. I was just going 
through the published artifacts and it seems that some are missing in the 
latest release.

In comparison you can look at 

https://repository.apache.org/content/repositories/orgapacheflink-1370/org/apache/flink/
 with the full list of artifacts for the first rc and 
https://repository.apache.org/content/repositories/orgapacheflink-1374/org/apache/flink/
 with only a subset for the second one.

Did you only upload the artifacts which have not been changed?

Best,
Fabian




Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

2020-06-22 Thread Zhijiang
Hi Fabian,

Thanks for finding this and letting us know. I will double-check it and update the 
missing jars afterwards if needed.

Best,
Zhijiang


--
From:Fabian Paul 
Send Time:2020年6月22日(星期一) 16:55
To:dev 
Cc:zhijiang ; pnowojski 
Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

Hi,

Thanks for the great efforts in preparing the second rc. I was just going 
through the published artifacts and it seems that some are missing in the 
latest release.

In comparison you can look at 

https://repository.apache.org/content/repositories/orgapacheflink-1370/org/apache/flink/
 with the full list of artifacts for the first rc and 
https://repository.apache.org/content/repositories/orgapacheflink-1374/org/apache/flink/
 with only a subset for the second one.

Did you only upload the artifacts which have not been changed?

Best,
Fabian



Re: [ANNOUNCE] Yu Li is now part of the Flink PMC

2020-06-16 Thread Zhijiang
Congratulations Yu! Well deserved!

Best,
Zhijiang


--
From:Dian Fu 
Send Time:2020年6月17日(星期三) 10:48
To:dev 
Cc:Haibo Sun ; user ; user-zh 

Subject:Re: [ANNOUNCE] Yu Li is now part of the Flink PMC

Congrats Yu!

Regards,
Dian

> 在 2020年6月17日,上午10:35,Jark Wu  写道:
> 
> Congratulations Yu! Well deserved!
> 
> Best,
> Jark
> 
> On Wed, 17 Jun 2020 at 10:18, Haibo Sun  wrote:
> 
>> Congratulations Yu!
>> 
>> Best,
>> Haibo
>> 
>> 
>> At 2020-06-17 09:15:02, "jincheng sun"  wrote:
>>> Hi all,
>>> 
>>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is now
>>> part of the Apache Flink Project Management Committee (PMC).
>>> 
>>> Yu Li has been very active on Flink's Statebackend component, working on
>>> various improvements, for example the RocksDB memory management for 1.10.
>>> and keeps checking and voting for our releases, and also has successfully
>>> produced two releases(1.10.0&1.10.1) as RM.
>>> 
>>> Congratulations & Welcome Yu Li!
>>> 
>>> Best,
>>> Jincheng (on behalf of the Flink PMC)
>> 
>> 



Re: [ANNOUNCE] New Flink Committer: Benchao Li

2020-06-09 Thread Zhijiang
Congratulations, Benchao!

Best,
Zhijiang


--
From:SteNicholas 
Send Time:2020年6月9日(星期二) 15:11
To:dev 
Subject:Re: [ANNOUNCE] New Flink Committer: Benchao Li

Congratulations, Benchao!

Best,
Nicholas



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/



Re: [ANNOUNCE] New Apache Flink Committer - Xintong Song

2020-06-05 Thread Zhijiang
Congratulations! 
Best,
Zhijiang 
--
From:Andrey Zagrebin 
Send Time:2020年6月5日(星期五) 22:34
To:dev 
Subject:Re: [ANNOUNCE] New Apache Flink Committer - Xintong Song
Welcome to committers and congrats, Xintong!

Cheers,
Andrey

On Fri, Jun 5, 2020 at 4:22 PM Till Rohrmann  wrote:

> Congratulations!
>
> Cheers,
> Till
>
> On Fri, Jun 5, 2020 at 10:00 AM Dawid Wysakowicz 
> wrote:
>
> > Congratulations!
> >
> > Best,
> >
> > Dawid
> >
> > On 05/06/2020 09:10, tison wrote:
> > > Congrats, Xintong!
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Jark Wu  于2020年6月5日周五 下午3:00写道:
> > >
> > >> Congratulations Xintong!
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> On Fri, 5 Jun 2020 at 14:32, Danny Chan  wrote:
> > >>
> > >>> Congratulations Xintong !
> > >>>
> > >>> Best,
> > >>> Danny Chan
> > >>> 在 2020年6月5日 +0800 PM2:20,dev@flink.apache.org,写道:
> > >>>> Congratulations Xintong
> >
>



[jira] [Created] (FLINK-18088) Umbrella for features testing in release-1.11.0

2020-06-03 Thread Zhijiang (Jira)
Zhijiang created FLINK-18088:


 Summary: Umbrella for features testing in release-1.11.0 
 Key: FLINK-18088
 URL: https://issues.apache.org/jira/browse/FLINK-18088
 Project: Flink
  Issue Type: Test
Affects Versions: 1.11.0
Reporter: Zhijiang
 Fix For: 1.11.0


This is the umbrella issue for tracking the testing progress of all related 
features in release-1.11.0, either via e2e tests or manual testing in a 
cluster, to confirm that the features work in practice with good quality.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18063) Fix the race condition for aborting current checkpoint in CheckpointBarrierUnaligner#processEndOfPartition

2020-06-02 Thread Zhijiang (Jira)
Zhijiang created FLINK-18063:


 Summary: Fix the race condition for aborting current checkpoint in 
CheckpointBarrierUnaligner#processEndOfPartition
 Key: FLINK-18063
 URL: https://issues.apache.org/jira/browse/FLINK-18063
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.11.0
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0, 1.12.0


In the handling of CheckpointBarrierUnaligner#processEndOfPartition, the 
current checkpoint is only aborted based on whether a pending checkpoint was 
triggered from task-thread processing, so it misses the scenario where the 
checkpoint was triggered by notifyBarrierReceived from the netty thread.

The proper fix should also check for a pending checkpoint inside 
ThreadSafeUnaligner, in order to abort it and reset the internal variables in 
that case.
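The intended fix can be sketched as follows. This is a simplified, hypothetical model (`SimplifiedUnaligner` and its fields are illustrative names, not the actual Flink classes): end-of-partition must abort a checkpoint regardless of whether the task thread or the netty thread triggered it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: pending checkpoints can be triggered from either the
// task thread (processBarrier) or the netty thread (notifyBarrierReceived).
// processEndOfPartition must consult both, not just the task-thread side.
class SimplifiedUnaligner {
    static final long NONE = -1L;

    // checkpoint the task thread triggered
    long taskThreadPendingId = NONE;
    // checkpoint triggered from the netty thread (the thread-safe part)
    final AtomicLong nettyTriggeredId = new AtomicLong(NONE);

    void processBarrier(long checkpointId) {
        taskThreadPendingId = checkpointId;
    }

    void notifyBarrierReceived(long checkpointId) {
        nettyTriggeredId.set(checkpointId);
    }

    /** Returns the id of the aborted checkpoint, or NONE if nothing was pending. */
    long processEndOfPartition() {
        // The buggy version only checked taskThreadPendingId; the fix also
        // consults the netty-triggered id so that checkpoint is aborted too.
        long toAbort = Math.max(taskThreadPendingId, nettyTriggeredId.get());
        taskThreadPendingId = NONE;
        nettyTriggeredId.set(NONE);
        return toAbort;
    }
}
```

With the buggy single-sided check, a checkpoint started via `notifyBarrierReceived` would survive the end-of-partition event and be resumed later.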





[jira] [Created] (FLINK-18050) Fix the bug of recycling buffer twice once exception in ChannelStateWriteRequestDispatcher#dispatch

2020-06-01 Thread Zhijiang (Jira)
Zhijiang created FLINK-18050:


 Summary: Fix the bug of recycling buffer twice once exception in 
ChannelStateWriteRequestDispatcher#dispatch
 Key: FLINK-18050
 URL: https://issues.apache.org/jira/browse/FLINK-18050
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.11.0
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0, 1.12.0


In ChannelStateWriteRequestDispatcherImpl#dispatch, `request.cancel(e)` is 
called to recycle the request's internal buffer once an exception happens.

However, for write-output requests the buffers are also recycled inside 
ChannelStateCheckpointWriter#write, regardless of whether an exception 
occurred. So in the exception case the buffers in the request are recycled 
twice, which causes further exceptions in the network shuffle process that 
references the same buffer.

This bug can be reproduced easily via running 
UnalignedCheckpointITCase#shouldPerformUnalignedCheckpointOnParallelRemoteChannel.
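The double-recycle hazard can be illustrated with a small reference-counting sketch. The classes below are hypothetical stand-ins (not Flink's actual `Buffer`/dispatcher code): the point is that whichever component takes ownership of the buffer must be the only one to recycle it on failure.

```java
// Minimal reference-counted buffer: recycling twice is an error, mirroring
// how recycling a network buffer twice corrupts the shuffle process.
class RefCountedBuffer {
    private int refCount = 1;

    synchronized void recycle() {
        if (refCount <= 0) {
            throw new IllegalStateException("buffer recycled twice");
        }
        refCount--;
    }

    synchronized boolean isRecycled() {
        return refCount == 0;
    }
}

class DispatcherSketch {
    /**
     * Sketch of the fix: on failure, only recycle the buffer if ownership was
     * NOT already handed to the writer, because the writer recycles buffers it
     * owns even when the write fails. The buggy version called
     * request.cancel(e) unconditionally, recycling the same buffer twice.
     */
    static void dispatchWrite(RefCountedBuffer buffer, boolean ownershipTransferred) {
        try {
            if (ownershipTransferred) {
                buffer.recycle(); // the writer recycles its own buffer...
            }
            throw new RuntimeException("simulated write failure");
        } catch (RuntimeException e) {
            if (!ownershipTransferred) {
                buffer.recycle(); // ...so the dispatcher only recycles otherwise
            }
        }
    }
}
```

Either path recycles the buffer exactly once; removing the `ownershipTransferred` guard reproduces the double-recycle described above.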





Re: [NOTICE] Release guide updated for updating japicmp configuration

2020-05-29 Thread Zhijiang
Thanks for the updates, Chesnay! 
Really helpful!

Best,
Zhijiang


--
From:Piotr Nowojski 
Send Time:2020年5月29日(星期五) 18:03
To:Chesnay Schepler 
Cc:dev@flink.apache.org ; zhijiang 

Subject:Re: [NOTICE] Release guide updated for updating japicmp configuration

Thanks Chesney for adding those scripts and configuring checks!

Piotrek


On 29 May 2020, at 10:04, Chesnay Schepler  wrote:
Hello everyone,
We recently decided to enforce compatibility for @PublicEvolving APIs for minor 
releases.
This requires modifications to the japicmp-maven-plugin execution on the 
corresponding release-X.Y branch after X.Y.Z was released.
In FLINK-17844 new tooling was added to take care of this 
(tools/releasing/update_japicmp_configuration.sh), but it must be run manually 
by the release manager after the release has concluded.
Note that this is also run automatically when an RC is created, as a final 
safeguard in case the manual step is missed.
I have amended the release guide accordingly: Update japicmp configuration
Update the japicmp reference version and enable API compatibility checks for 
@PublicEvolving APIs on the corresponding SNAPSHOT branch.
For a new major release (x.y.0), run the same command also on the master branch 
to update the japicmp reference version.
tools $ NEW_VERSION=$RELEASE_VERSION releasing/update_japicmp_configuration.sh
tools $ cd ..
$ git add *
$ git commit -m "Update japicmp configuration for $RELEASE_VERSION"  



[jira] [Created] (FLINK-17994) Fix the race condition between CheckpointBarrierUnaligner#processBarrier and #notifyBarrierReceived

2020-05-27 Thread Zhijiang (Jira)
Zhijiang created FLINK-17994:


 Summary: Fix the race condition between 
CheckpointBarrierUnaligner#processBarrier and #notifyBarrierReceived
 Key: FLINK-17994
 URL: https://issues.apache.org/jira/browse/FLINK-17994
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0


The race condition issue happens as follows:
 * Barrier ch1 is received from the network by the netty thread, which 
schedules ch1 into the mailbox via #notifyBarrierReceived.
 * Barrier ch2 is received from the network by the netty thread, but before 
#notifyBarrierReceived is called, this barrier is inserted into the channel's 
data queue. The task thread can therefore process ch2 earlier than the netty 
thread calls #notifyBarrierReceived.
 * The task thread executes the checkpoint for ch2 directly because ch2 > ch1.
 * After that, the previously scheduled ch1 is taken from the mailbox by the 
task thread, which causes an IllegalArgumentException inside 
SubtaskCheckpointCoordinatorImpl#checkpointState because it breaks the 
assumption that checkpoints are executed with increasing ids.

One possible solution for this race condition is to insert the received barrier 
into the channel's data queue only after calling #notifyBarrierReceived. Then 
we can rely on the assumption that the checkpoint is always triggered by the 
netty thread, which simplifies the current situation where a checkpoint might 
be triggered either by the task thread or the netty thread.

This also avoids the task thread accessing #notifyBarrierReceived while 
processing the barrier, which simplifies the logic inside 
CheckpointBarrierUnaligner.
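The proposed ordering can be sketched like this. `ChannelSketch` is a deliberately simplified, hypothetical model (not the real `RemoteInputChannel`/`CheckpointBarrierUnaligner`): notifying the unaligner before exposing the barrier to the task thread guarantees the netty thread always triggers the checkpoint first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the fixed ordering on the netty-thread side.
class ChannelSketch {
    // barriers visible to the task thread
    final Deque<Long> dataQueue = new ArrayDeque<>();
    // order in which checkpoints were triggered (stands in for the mailbox)
    final List<Long> triggerOrder = new ArrayList<>();

    // Buggy order: dataQueue.add(...) first, notifyBarrierReceived(...) later,
    // so the task thread may see ch2 before its notification ran.
    // Fixed order:
    void onBarrierReceived(long checkpointId) {
        notifyBarrierReceived(checkpointId); // trigger the checkpoint first...
        dataQueue.add(checkpointId);         // ...then expose it to the task thread
    }

    void notifyBarrierReceived(long checkpointId) {
        triggerOrder.add(checkpointId);
    }
}
```

With this ordering, by the time the task thread can dequeue a barrier, its checkpoint has already been triggered, so checkpoints are observed with increasing ids.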





[jira] [Created] (FLINK-17992) Exception from RemoteInputChannel#onBuffer should not fail the whole NetworkClientHandler

2020-05-27 Thread Zhijiang (Jira)
Zhijiang created FLINK-17992:


 Summary: Exception from RemoteInputChannel#onBuffer should not 
fail the whole NetworkClientHandler
 Key: FLINK-17992
 URL: https://issues.apache.org/jira/browse/FLINK-17992
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.10.1, 1.10.0
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0


RemoteInputChannel#onBuffer is invoked by 
CreditBasedPartitionRequestClientHandler while receiving and decoding network 
data. #onBuffer can throw exceptions, which tag the error in the client 
handler and fail all input channels registered with that handler. This can 
cause the following tricky issue.

If the RemoteInputChannel is being canceled by the canceler thread, the task 
thread might exit earlier than the canceler thread terminates. That means the 
PartitionRequestClient might not yet be closed (its closing is triggered by 
the canceler thread) while a new task attempt is already deployed onto this 
TaskManager. The new task might then reuse the previous PartitionRequestClient 
when requesting partitions, but the respective client handler was already 
tagged with an error during the earlier RemoteInputChannel#onBuffer. This 
causes an unnecessary additional failover.

This potential issue is hard to find in production because the job eventually 
recovers after one or more additional failovers. We found it via 
UnalignedCheckpointITCase, which defines a precise number of restarts for the 
configured failures.

The solution is to fail only the respective task when its 
RemoteInputChannel#onBuffer throws an exception, instead of failing all 
channels inside the client handler. The client then stays healthy and can be 
reused by other input channels as long as it has not been released yet.
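The error-scoping change can be sketched as follows. `ClientHandlerSketch` is an illustrative stand-in (not the real `CreditBasedPartitionRequestClientHandler`): an exception from one channel's buffer handling marks only that channel as failed, leaving the shared handler usable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: per-channel failure instead of handler-wide failure.
class ClientHandlerSketch {
    final Map<Integer, Boolean> channelFailed = new HashMap<>();
    boolean handlerErrored = false; // buggy version set this on any exception

    void onBuffer(int channelId, boolean throwsError) {
        try {
            if (throwsError) {
                throw new RuntimeException("bad buffer");
            }
            // normal buffer processing would happen here
        } catch (RuntimeException e) {
            // Fix: fail only the offending channel; the handler (and thus the
            // shared PartitionRequestClient) stays healthy for other channels.
            channelFailed.put(channelId, true);
        }
    }
}
```

A new task attempt reusing the client afterwards only sees failures on the channel that actually misbehaved, avoiding the extra failover round.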





Re: [DISCUSS] Backpoint FLIP-126 (watermarks) integration with FLIP-27

2020-05-26 Thread Zhijiang
In the beginning, I had somewhat similar concerns as Piotr mentioned below.
After some offline discussions, and as explained by Stephan and Becket here, I 
am +1 to backport it to release-1.11.

Best,
Zhijiang


--
From:Piotr Nowojski 
Send Time:2020年5月26日(星期二) 18:51
To:Becket Qin 
Cc:Stephan Ewen ; dev ; zhijiang 

Subject:Re: [DISCUSS] Backpoint FLIP-126 (watermarks) integration with FLIP-27

Hi,

As we discussed this offline a bit, initially I was sceptical to merge it,
as:
- even it’s an isolated change, it can destabilise the builds and prolong
release testing period
- is distracting from solving release blockers etc

However all in all I’m +0.5 to merge it because of this argument:

> - It is API breaking. Without this patch, we would release a Source API
and immediately break compatibility in the next release.

And this:

>  - It is a fairly isolated change, does not affect any existing feature
in the system

Is limiting our risks, that we are not risking introducing bugs into the
existing features.

Piotrek

wt., 26 maj 2020 o 12:43 Becket Qin  napisał(a):

> Usually we should avoid checking in patches other than bug fix after
> feature freeze. However, in this particular case, the code base is sort of
> in an incomplete state - an exposed known-to-change feature - due to
> missing this patch. Fixing forward seems the best option. Besides that,
> FLIP-27 has been highly anticipated by many users. So if one patch
> completes the story, personally speaking I am +1 to backport given the
> isolated impact and significant benefit of doing that.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Tue, May 26, 2020 at 4:43 PM Stephan Ewen  wrote:
>
>> Hi all!
>>
>> I want to discuss merging this PR to the 1.11 release branch:
>> https://github.com/apache/flink/pull/12306
>>
>> It contains the new FLIP-126 Watermarks, and per-partition watermarking
>> to the FLIP-27 sources. In that sense it is partially a new feature after
>> the feature freeze. Hence this discussion, and not just merging.
>>
>> The reasons why I suggest to back-port this to 1.11 are
>>   - It is API breaking. Without this patch, we would release a Source API
>> and immediately break compatibility in the next release.
>>   - The FLIP-27 feature is experimental, but it should not be useless in
>> the sense that users have to re-write all implemented sources in the next
>> release.
>>   - It is a fairly isolated change, does not affect any existing feature
>> in the system
>>
>> Please let me know if you have concerns about this.
>>
>> Best,
>> Stephan
>>
>>



Re: [DISCUSS] Release flink-shaded 11.0 (and include it in 1.11.0)

2020-05-25 Thread Zhijiang
Thanks for driving this, Chesnay! 
+1 on my side.

Best,
Zhijiang
--
From:Till Rohrmann 
Send Time:2020年5月25日(星期一) 16:28
To:dev 
Subject:Re: [DISCUSS] Release flink-shaded 11.0 (and include it in 1.11.0)

+1 for the new flink-shaded release.

Cheers,
Till

On Mon, May 25, 2020 at 9:06 AM Chesnay Schepler  wrote:

> Hello,
>
> I would like to do another flink-shaded release for 1.11.0, to include a
> zookeeper 3.4 security fix and resolve a shading issue when working with
> Gradle.
>
>
>



Re: [DISCUSS] Semantics of our JIRA fields

2020-05-25 Thread Zhijiang
Thanks for launching this discussion and providing such detailed info, Robert! 
+1 on my side for the proposal. 

For "Affects Version", I previously thought it was only for already released 
versions, so it could serve as a reminder that the fix should also be picked 
into the related release branches for future minor versions.
I saw that Jark had similar concerns about this field in the replies below. 
Either way makes sense to me as long as we define a clear rule in the wiki.

Re Flavio's comments, I agree that the Jira reporter can leave most of the 
fields empty if unsure of them, and the respective component maintainer or 
committer can update them accordingly later.
But the state of the Jira ticket should not be marked as "resolved" when a PR 
is detected; that does not fit the resolved semantics, I think. If possible, 
the Jira ticket could be updated to "in progress" automatically once the 
respective PR is ready, which would save some time when tracking the progress 
of related issues during the release process.

Best,
Zhijiang
--
From:Congxian Qiu 
Send Time:2020年5月25日(星期一) 13:57
To:dev@flink.apache.org 
Subject:Re: [DISCUSS] Semantics of our JIRA fields

Hi

Currently, when creating an issue for the project website, I'm not sure what 
"Affects Version/s" should be set to. The problem is that the current 
Dockerfile[1] in flink-web is broken because of the EOL of Ubuntu 18.10[2]. 
The project website does not affect any release of Flink, but it does affect 
the process of building the website, so what version should I set?

[1]
https://github.com/apache/flink-web/blob/bc66f0f0f463ab62a22e81df7d7efd301b76a6b4/docker/Dockerfile#L17
[2] https://wiki.ubuntu.com/Releases

Best,
Congxian


Flavio Pompermaier  于2020年5月24日周日 下午1:23写道:

> In my experience it's quite complicated for a normal reporter to be able to
> fill all the fields correctly (especially for new users).
> Usually you just wanto to report a problem, remember to add a new feature
> or improve code/documentation but you can't really give a priority, assign
> the correct label or decide which releases will benefit of the fix/new
> feature. This is something that only core developers could decide (IMHO).
>
> My experiece says that it's better if normal users could just open tickets
> with some default (or mark ticket as new) and leave them in such a state
> until an experienced user, one that can assign tickets, have the time to
> read it and immediately reject the ticket or accept it and properly assign
> priorities, fix version, etc.
>
> With respect to resolve/close I think that a good practice could be to mark
> automatically a ticket as resolved once that a PR is detected for that
> ticket, while marking it as closed should be done by the commiter who merge
> the PR.
>
> Probably this process would slightly increase the work of a committer but
> this would make things a lot cleaner and will bring the benefit of having a
> reliable and semantically sound JIRA state.
>
> Cheers,
> Flavio
>
> Il Dom 24 Mag 2020, 05:05 Israel Ekpo  ha scritto:
>
> > +1 for the proposal
> >
> > This will bring some consistency to the process
> >
> > Regarding Closed vs Resolved, should we go back and switch all the
> Resolved
> > issues to Closed so that is is not confusing to new people to the project
> > in the future that may not have seen this discussion?
> >
> > I would recommend changing it to Closed just to be consistent since that
> is
> > the majority and the new process will be using Closed going forward
> >
> > Those are my thoughts for now
> >
> > On Sat, May 23, 2020 at 7:48 AM Congxian Qiu 
> > wrote:
> >
> > > +1 for the proposal. It's good to have a unified description and write
> it
> > > down in the wiki, so that every contributor has the same understanding
> of
> > > all the fields.
> > >
> > > Best,
> > > Congxian
> > >
> > >
> > > Till Rohrmann  于2020年5月23日周六 上午12:04写道:
> > >
> > > > Thanks for drafting this proposal Robert. +1 for the proposal.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Fri, May 22, 2020 at 5:39 PM Leonard Xu 
> wrote:
> > > >
> > > > > Thanks bringing up this topic @Robert,  +1 to the proposal.
> > > > >
> > > > > It clarifies the JIRA fields well and should be a rule to follow.
> > > > >
> > > > > Best,
> > > > > Leonard Xu
> > > > > > 在 2020年5月22日,20:24,Aljoscha Krettek  写道:
> > > > > >
> > > > > > +1 That's also how I think

[ANNOUNCE] Apache Flink 1.11.0, release candidate #1

2020-05-24 Thread Zhijiang
Hi all,

Apache Flink-1.11.0-RC1 has been created. It has all the artifacts that we 
would typically have for a release.

This preview-only RC is created only to drive the current testing efforts, and 
no official vote will take place. It includes the following:

   * The preview source release and binary convenience releases [1], which are 
signed with the key with fingerprint 2DA85B93244FDFA19A6244500653C0A2CEA00D0E 
[2],
   * All artifacts that would normally be deployed to the Maven Central 
Repository [3]

To test with these artifacts, you can create a settings.xml file with the 
content shown below [4]. This settings file can be referenced in your maven 
commands
via --settings /path/to/settings.xml. This is useful for creating a quickstart 
project based on the staged release and also for building against the staged 
jars.

Happy testing!

Best,
Zhijiang

[1] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc1/
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3] https://repository.apache.org/content/repositories/orgapacheflink-1370/
[4]


<settings>
  <activeProfiles>
    <activeProfile>flink-1.11.0</activeProfile>
  </activeProfiles>
  <profiles>
    <profile>
      <id>flink-1.11.0</id>
      <repositories>
        <repository>
          <id>flink-1.11.0</id>
          <url>https://repository.apache.org/content/repositories/orgapacheflink-1370/</url>
        </repository>
        <repository>
          <id>archetype</id>
          <url>https://repository.apache.org/content/repositories/orgapacheflink-1370/</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
</settings>




[jira] [Created] (FLINK-17869) Fix the race condition of aborting unaligned checkpoint

2020-05-21 Thread Zhijiang (Jira)
Zhijiang created FLINK-17869:


 Summary: Fix the race condition of aborting unaligned checkpoint
 Key: FLINK-17869
 URL: https://issues.apache.org/jira/browse/FLINK-17869
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0


On ChannelStateWriter side, the lifecycle of checkpoint should be as follows:

start -> in progress/abort -> stop.

We must guarantee that #abort is queued after #start; otherwise the aborted 
checkpoint might be started again later in case of a race condition.

There are two cases that trigger aborting a checkpoint:
 * One is CheckpointBarrierUnaligner#processEndOfPartition, which should abort 
all current and future checkpoints, without judging the condition 
`isCheckpointPending()` as the current code does.
 * The other is CheckpointBarrierUnaligner#processCancellationBarrier, which 
should only abort the respective checkpoint id if it was already triggered 
before.

The unaligned checkpoint might be triggered either by the task thread or by 
the netty thread inside ThreadSafeUnaligner. Either way, we need to know the 
currently triggered checkpoint id in order to handle both cases properly.

Another bug is that ChannelStateWriterImpl#abort should not remove the 
respective ChannelStateWriteResult. Otherwise ChannelStateWriterImpl#getWriteResult 
throws an IllegalArgumentException during the checkpoint process. The 
ChannelStateWriteResult should be created in the #start method and only 
removed in the #stop method.
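The intended lifecycle (start → abort → stop) can be sketched like this. `WriterSketch` is an illustrative model, not the real `ChannelStateWriterImpl`: an abort keeps the per-checkpoint result entry (only #stop removes it), and an already-aborted id can never be started again.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the fixed ChannelStateWriter lifecycle rules.
class WriterSketch {
    final Map<Long, String> results = new HashMap<>();
    final Set<Long> aborted = new HashSet<>();

    void start(long checkpointId) {
        if (aborted.contains(checkpointId)) {
            return; // never resurrect an already-aborted checkpoint
        }
        results.put(checkpointId, "IN_PROGRESS");
    }

    void abort(long checkpointId) {
        aborted.add(checkpointId);
        // Fix: keep the result entry (just mark it) so getWriteResult does
        // not throw; the buggy version removed it here.
        results.computeIfPresent(checkpointId, (k, v) -> "ABORTED");
    }

    String getWriteResult(long checkpointId) {
        String r = results.get(checkpointId);
        if (r == null) {
            throw new IllegalArgumentException("unknown checkpoint " + checkpointId);
        }
        return r;
    }

    void stop(long checkpointId) {
        results.remove(checkpointId); // only #stop removes the result
    }
}
```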





Re: [DISCUSS] Releasing Stateful Functions 2.1.0 soon?

2020-05-20 Thread Zhijiang
I also like this idea, considering Stateful Functions is flexible enough to 
have a faster release cycle. +1 from my side.

Best,
Zhijiang


--
From:Seth Wiesman 
Send Time:2020年5月20日(星期三) 21:45
To:dev 
Subject:Re: [DISCUSS] Releasing Stateful Functions 2.1.0 soon?

+1 for a fast release cycle

Seth

On Wed, May 20, 2020 at 8:43 AM Robert Metzger  wrote:

> I like the idea of releasing Statefun more frequently to have faster
> feedback cycles!
>
> No objections for releasing 2.1.0 from my side.
>
> On Wed, May 20, 2020 at 2:22 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi devs,
> >
> > Since Stateful Functions 2.0 was released early April,
> > we've been getting some good feedback from various channels,
> > including the Flink mailing lists, JIRA issues, as well as Stack Overflow
> > questions.
> >
> > Some of the discussions have actually translated into new features
> > currently being implemented into the project, such as:
> >
> >- State TTL for the state primitives in Stateful Functions (for both
> >embedded/remote functions)
> >- Transport for remote functions via UNIX domain sockets, which would
> be
> >useful when remote functions are co-located with Flink StateFun
> workers
> >(i.e. the "sidecar" deployment mode)
> >
> >
> > Besides that, some critical shortcomings have already been addressed
> since
> > the last release:
> >
> >- After upgrading to Flink 1.10.1, failure recovery in Stateful
> >Functions now works properly with the new scheduler.
> >- Support for concurrent checkpoints
> >
> >
> > With these ongoing threads, while it's only been just short of 2 months
> > since the last release,
> > we (Igal Shilman and I) have been thinking about aiming to already start
> > the next feature release (2.1.0) soon.
> > This is relatively shorter than the release cycle of what the community
> is
> > used to in Flink (usually 3 months at least),
> > but we think with the StateFun project in its early phases, having
> smaller
> > and more frequent feature releases could potentially help drive user
> > adoption.
> >
> > So, what do you think about setting feature freeze for StateFun 2.1.0 by
> > next Wednesday (May 27th)?
> > Of course, whether or not to actually have another feature release
> already
> > is still an open discussion - if you prefer a richer feature release with
> > more features included besides the ones listed above, please do comment!
> >
> > Cheers,
> > Gordon
> >
>



[jira] [Created] (FLINK-17823) Resolve the race condition while releasing RemoteInputChannel

2020-05-19 Thread Zhijiang (Jira)
Zhijiang created FLINK-17823:


 Summary: Resolve the race condition while releasing 
RemoteInputChannel
 Key: FLINK-17823
 URL: https://issues.apache.org/jira/browse/FLINK-17823
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.11.0
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0


RemoteInputChannel#releaseAllResources might be called by the canceler thread 
while the task thread calls RemoteInputChannel#getNextBuffer. This can cause 
two potential problems:
 * The task thread might get a null buffer after the canceler thread has 
already released all the buffers, causing a misleading NPE in getNextBuffer.
 * The task thread and the canceler thread might poll the same buffer 
concurrently, causing an unexpected exception when the same buffer is recycled 
twice.

The solution is to properly synchronize on the buffer queue in the release 
method so the same buffer cannot be pulled by both the canceler thread and the 
task thread. In the getNextBuffer method, we add explicit checks to avoid the 
misleading NPE and to raise meaningful exceptions instead.
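The synchronization pattern can be sketched as follows. `ChannelBuffers` is a simplified, hypothetical class (not the actual `RemoteInputChannel`): both release and poll take the same lock, and a released channel surfaces a clear state instead of a misleading NPE.

```java
import java.util.ArrayDeque;

// Hypothetical sketch: release and getNextBuffer synchronize on the same
// queue, so a buffer can never be polled (and recycled) by both the canceler
// thread and the task thread.
class ChannelBuffers {
    private final ArrayDeque<Object> buffers = new ArrayDeque<>();
    private boolean released = false;

    void add(Object buffer) {
        synchronized (buffers) {
            buffers.add(buffer);
        }
    }

    /** Called by the canceler thread. */
    void releaseAllResources() {
        synchronized (buffers) {
            buffers.clear(); // each buffer is recycled exactly once, under the lock
            released = true;
        }
    }

    /** Called by the task thread. */
    Object getNextBuffer() {
        synchronized (buffers) {
            if (released) {
                // Explicit check: surface a meaningful exception instead of a
                // misleading NullPointerException on an already-drained queue.
                throw new IllegalStateException("channel already released");
            }
            return buffers.poll();
        }
    }
}
```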





Re: [ANNOUNCE] Apache Flink 1.10.1 released

2020-05-18 Thread Zhijiang
Thanks Yu for being the release manager, and thanks to everyone involved.

Best,
Zhijiang
--
From:Arvid Heise 
Send Time:2020年5月18日(星期一) 23:17
To:Yangze Guo 
Cc:dev ; Apache Announce List ; user 
; Yu Li ; user-zh 

Subject:Re: [ANNOUNCE] Apache Flink 1.10.1 released

Thank you very much!

On Mon, May 18, 2020 at 8:28 AM Yangze Guo  wrote:
Thanks Yu for the great job. Congrats everyone who made this release possible.
 Best,
 Yangze Guo

 On Mon, May 18, 2020 at 10:57 AM Leonard Xu  wrote:
 >
 >
 > Thanks Yu for being the release manager, and everyone else who made this 
 > possible.
 >
 > Best,
 > Leonard Xu
 >
 > 在 2020年5月18日,10:43,Zhu Zhu  写道:
 >
 > Thanks Yu for being the release manager. Thanks everyone who made this 
 > release possible!
 >
 > Thanks,
 > Zhu Zhu
 >
 > Benchao Li  于2020年5月15日周五 下午7:51写道:
 >>
 >> Thanks Yu for the great work, and everyone else who made this possible.
 >>
 >> Dian Fu  于2020年5月15日周五 下午6:55写道:
 >>>
 >>> Thanks Yu for managing this release and everyone else who made this 
 >>> release possible. Good work!
 >>>
 >>> Regards,
 >>> Dian
 >>>
 >>> 在 2020年5月15日,下午6:26,Till Rohrmann  写道:
 >>>
 >>> Thanks Yu for being our release manager and everyone else who made the 
 >>> release possible!
 >>>
 >>> Cheers,
 >>> Till
 >>>
 >>> On Fri, May 15, 2020 at 9:15 AM Congxian Qiu  
 >>> wrote:
 >>>>
 >>>> Thanks a lot for the release and your great job, Yu!
 >>>> Also thanks to everyone who made this release possible!
 >>>>
 >>>> Best,
 >>>> Congxian
 >>>>
 >>>>
 >>>> Yu Li  于2020年5月14日周四 上午1:59写道:
 >>>>>
 >>>>> The Apache Flink community is very happy to announce the release of 
 >>>>> Apache Flink 1.10.1, which is the first bugfix release for the Apache 
 >>>>> Flink 1.10 series.
 >>>>>
 >>>>> Apache Flink(r) is an open-source stream processing framework for 
 >>>>> distributed, high-performing, always-available, and accurate data 
 >>>>> streaming applications.
 >>>>>
 >>>>> The release is available for download at:
 >>>>> https://flink.apache.org/downloads.html
 >>>>>
 >>>>> Please check out the release blog post for an overview of the 
 >>>>> improvements for this bugfix release:
 >>>>> https://flink.apache.org/news/2020/05/12/release-1.10.1.html
 >>>>>
 >>>>> The full release notes are available in Jira:
 >>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346891
 >>>>>
 >>>>> We would like to thank all contributors of the Apache Flink community 
 >>>>> who made this release possible!
 >>>>>
 >>>>> Regards,
 >>>>> Yu
 >>>
 >>>
 >>
 >>
 >> --
 >>
 >> Benchao Li
 >> School of Electronics Engineering and Computer Science, Peking University
 >> Tel:+86-15650713730
 >> Email: libenc...@gmail.com; libenc...@pku.edu.cn
 >
 >


-- 
Arvid Heise | Senior Java Developer

Follow us @VervericaData
--
Join Flink Forward - The Apache Flink Conference
Stream Processing | Event Driven | Real Time
--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) 
Cheng



Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-15 Thread Zhijiang
Sounds good, +1.

Best,
Zhijiang


--
From:Thomas Weise 
Send Time:2020年5月15日(星期五) 21:33
To:dev 
Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary 
compatible across bug fix releases (x.y.u and x.y.v)

+1


On Fri, May 15, 2020 at 6:15 AM Till Rohrmann  wrote:

> Dear community,
>
> with reference to the dev ML thread about guaranteeing API and binary
> compatibility for @PublicEvolving classes across bug fix releases [1] I
> would like to start a vote about it.
>
> The proposal is that the Flink community starts to guarantee
> that @PublicEvolving classes will be API and binary compatible across bug
> fix releases of the same minor version. This means that a version x.y.u is
> API and binary compatible to x.y.v with u <= v wrt all @PublicEvolving
> classes.
>
> The voting options are the following:
>
> * +1, Provide the above described guarantees
> * -1, Do not provide the above described guarantees (please provide
> specific comments)
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval with at least 3 PMC affirmative votes.
>
> [1]
>
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
>
> Cheers,
> Till
>



Re: [DISCUSS] Exact feature freeze date

2020-05-15 Thread Zhijiang
+1 for Monday morning in Europe.

Best,
Zhijiang


--
From:Yun Tang 
Send Time:2020年5月15日(星期五) 21:58
To:dev 
Subject:Re: [DISCUSS] Exact feature freeze date

+1 for Monday morning in Europe.

Best
Yun Tang

From: Jingsong Li 
Sent: Friday, May 15, 2020 21:17
To: dev 
Subject: Re: [DISCUSS] Exact feature freeze date

+1 for Monday morning.

Best,
Jingsong Lee

On Fri, May 15, 2020 at 8:45 PM Till Rohrmann  wrote:

> +1 from my side extend the feature freeze until Monday morning.
>
> Cheers,
> Till
>
> On Fri, May 15, 2020 at 2:04 PM Robert Metzger 
> wrote:
>
> > I'm okay, but I would suggest to agree on a time of day. What about
> Monday
> > morning in Europe?
> >
> > On Fri, May 15, 2020 at 1:43 PM Piotr Nowojski 
> > wrote:
> >
> > > Hi,
> > >
> > > Couple of contributors asked for extending cutting the release branch
> > > until Monday, what do you think about such extension?
> > >
> > > (+1 from my side)
> > >
> > > Piotrek
> > >
> > > > On 25 Apr 2020, at 21:24, Yu Li  wrote:
> > > >
> > > > +1 for extending the feature freeze to May 15th.
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Fri, 24 Apr 2020 at 14:43, Yuan Mei 
> wrote:
> > > >
> > > >> +1
> > > >>
> > > >> On Thu, Apr 23, 2020 at 4:10 PM Stephan Ewen 
> > wrote:
> > > >>
> > > >>> Hi all!
> > > >>>
> > > >>> I want to bring up a discussion about when we want to do the
> feature
> > > >> freeze
> > > >>> for 1.11.
> > > >>>
> > > >>> When kicking off the release cycle, we tentatively set the date to
> > end
> > > of
> > > >>> April, which would be in one week.
> > > >>>
> > > >>> I can say from the features I am involved with (FLIP-27, FLIP-115,
> > > >>> reviewing some state backend improvements, etc.) that it would be
> > > helpful
> > > >>> to have two additional weeks.
> > > >>>
> > > >>> When looking at various other feature threads, my feeling is that
> > there
> > > >> are
> > > >>> more contributors and committers that could use a few more days.
> > > >>> The last two months were quite exceptional in and we did lose a bit
> > of
> > > >>> development speed here and there.
> > > >>>
> > > >>> How do you think about making *May 15th* the feature freeze?
> > > >>>
> > > >>> Best,
> > > >>> Stephan
> > > >>>
> > > >>
> > >
> > >
> >
>


--
Best, Jingsong Lee



[jira] [Created] (FLINK-17719) Provide ChannelStateReader#hasStates for hints of reading channel states

2020-05-15 Thread Zhijiang (Jira)
Zhijiang created FLINK-17719:


 Summary: Provide ChannelStateReader#hasStates for hints of reading 
channel states
 Key: FLINK-17719
 URL: https://issues.apache.org/jira/browse/FLINK-17719
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing, Runtime / Task
Reporter: Zhijiang
Assignee: Zhijiang


Currently we rely on whether unaligned checkpointing is enabled to determine 
whether to read recovered channel states during task startup. This prevents 
recovering from previously written unaligned states when the current mode is 
aligned.

We can make `ChannelStateReader` provide a hint on whether there are any 
channel states to be read during startup, so we never lose the chance to 
recover from them.
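A minimal sketch of the hinted API: startup consults the reader itself instead of the unaligned-checkpoint flag. The interface name follows the issue title; the surrounding class and method bodies are illustrative assumptions, not the actual Flink code.

```java
// Hypothetical functional interface carrying the hint from the issue.
interface ChannelStateReaderSketch {
    boolean hasChannelStates();
}

class TaskStartupSketch {
    static boolean shouldReadRecoveredState(ChannelStateReaderSketch reader,
                                            boolean unalignedEnabled) {
        // Old logic: return unalignedEnabled;
        // -> loses recovered states when a job restarts in aligned mode
        //    from a snapshot that still contains unaligned channel states.
        return reader.hasChannelStates();
    }
}
```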





[jira] [Created] (FLINK-17413) Remove redundant states from ThreadSafeUnaligner

2020-04-27 Thread Zhijiang (Jira)
Zhijiang created FLINK-17413:


 Summary: Remove redundant states from ThreadSafeUnaligner
 Key: FLINK-17413
 URL: https://issues.apache.org/jira/browse/FLINK-17413
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Zhijiang
Assignee: Zhijiang


In RemoteInputChannel, we already have the `lastRequestedCheckpointId` and 
`receivedCheckpointId` states to control whether a received buffer should be 
forwarded to the unaligner component.

In the current ThreadSafeUnaligner, the variable `storeNewBuffers` serves a 
similar purpose: deciding whether a notified buffer should be written to the 
persister. In other words, as long as the `RemoteInputChannel` decides to 
forward a received buffer, that buffer always needs to be spilled. So we can 
remove the variable `storeNewBuffers` from ThreadSafeUnaligner completely.
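The gating described above can be sketched as follows. This is a simplified illustration with assumed field and method names, not the actual Flink classes:

```java
// Simplified illustration of the checkpoint-id gating; field and method
// names are assumptions for demonstration, not the real Flink code.
class ChannelGateSketch {
    long lastRequestedCheckpointId;
    long receivedCheckpointId;

    ChannelGateSketch(long lastRequested, long received) {
        this.lastRequestedCheckpointId = lastRequested;
        this.receivedCheckpointId = received;
    }

    // The channel decides once whether a received buffer belongs to an
    // in-flight checkpoint and must be forwarded to the unaligner.
    boolean shouldNotifyUnaligner() {
        return receivedCheckpointId < lastRequestedCheckpointId;
    }
}

public class UnalignerGatingExample {
    public static void main(String[] args) {
        // Checkpoint 5 was requested but its barrier has not arrived yet,
        // so in-flight buffers must be forwarded (and hence spilled).
        ChannelGateSketch gate = new ChannelGateSketch(5L, 4L);
        // Because the forwarding decision is made at the channel, the
        // unaligner can spill every buffer it receives without keeping
        // its own `storeNewBuffers` flag.
        System.out.println(gate.shouldNotifyUnaligner()); // prints "true"
    }
}
```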



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-27 Thread Zhijiang
Thanks Dian for the release work and thanks everyone involved. 

Best,
Zhijiang


--
From:Till Rohrmann 
Send Time:2020 Apr. 27 (Mon.) 15:13
To:Jingsong Li 
Cc:dev ; Leonard Xu ; Benchao Li 
; Konstantin Knauf ; jincheng 
sun ; Hequn Cheng ; Dian Fu 
; user ; user-zh 
; Apache Announce List 
Subject:Re: [ANNOUNCE] Apache Flink 1.9.3 released

Thanks Dian for being our release manager and thanks to everyone who helped
making this release possible.

Cheers,
Till

On Mon, Apr 27, 2020 at 3:26 AM Jingsong Li  wrote:

> Thanks Dian for managing this release!
>
> Best,
> Jingsong Lee
>
> On Sun, Apr 26, 2020 at 7:17 PM Jark Wu  wrote:
>
>> Thanks Dian for being the release manager and thanks all who make this
>> possible.
>>
>> Best,
>> Jark
>>
>> On Sun, 26 Apr 2020 at 18:06, Leonard Xu  wrote:
>>
>> > Thanks Dian for the release and being the release manager !
>> >
>> > Best,
>> > Leonard Xu
>> >
>> >
>> > On Apr 26, 2020, at 17:58, Benchao Li  wrote:
>> >
>> > Thanks Dian for the effort, and all who make this release possible.
>> Great
>> > work!
>> >
>> > On Sun, Apr 26, 2020 at 5:21 PM, Konstantin Knauf  wrote:
>> >
>> >> Thanks for managing this release!
>> >>
>> >> On Sun, Apr 26, 2020 at 3:58 AM jincheng sun > >
>> >> wrote:
>> >>
>> >>> Thanks for your great job, Dian!
>> >>>
>> >>> Best,
>> >>> Jincheng
>> >>>
>> >>>
>> >>> On Sat, Apr 25, 2020 at 8:30 PM, Hequn Cheng  wrote:
>> >>>
>> >>>> @Dian, thanks a lot for the release and for being the release
>> manager.
>> >>>> Also thanks to everyone who made this release possible!
>> >>>>
>> >>>> Best,
>> >>>> Hequn
>> >>>>
>> >>>> On Sat, Apr 25, 2020 at 7:57 PM Dian Fu  wrote:
>> >>>>
>> >>>>> Hi everyone,
>> >>>>>
>> >>>>> The Apache Flink community is very happy to announce the release of
>> >>>>> Apache Flink 1.9.3, which is the third bugfix release for the
>> Apache Flink
>> >>>>> 1.9 series.
>> >>>>>
>> >>>>> Apache Flink® is an open-source stream processing framework for
>> >>>>> distributed, high-performing, always-available, and accurate data
>> streaming
>> >>>>> applications.
>> >>>>>
>> >>>>> The release is available for download at:
>> >>>>> https://flink.apache.org/downloads.html
>> >>>>>
>> >>>>> Please check out the release blog post for an overview of the
>> >>>>> improvements for this bugfix release:
>> >>>>> https://flink.apache.org/news/2020/04/24/release-1.9.3.html
>> >>>>>
>> >>>>> The full release notes are available in Jira:
>> >>>>> https://issues.apache.org/jira/projects/FLINK/versions/12346867
>> >>>>>
>> >>>>> We would like to thank all contributors of the Apache Flink
>> community
>> >>>>> who made this release possible!
>> >>>>> Also great thanks to @Jincheng for helping finalize this release.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Dian
>> >>>>>
>> >>>>
>> >>
>> >> --
>> >> Konstantin Knauf | Head of Product
>> >> +49 160 91394525
>> >>
>> >> Follow us @VervericaData Ververica <https://www.ververica.com/>
>> >>
>> >> --
>> >> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> >> Conference
>> >> Stream Processing | Event Driven | Real Time
>> >> --
>> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>> >> --
>> >> Ververica GmbH
>> >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> >> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
>> >> (Tony) Cheng
>> >>
>> >
>> >
>> > --
>> >
>> > Benchao Li
>> > School of Electronics Engineering and Computer Science, Peking
>> University
>> > Tel:+86-15650713730
>> > Email: libenc...@gmail.com; libenc...@pku.edu.cn
>> >
>> >
>> >
>>
>
>
> --
> Best, Jingsong Lee
>



[jira] [Created] (FLINK-17389) LocalExecutorITCase.testBatchQueryCancel asserts error

2020-04-26 Thread Zhijiang (Jira)
Zhijiang created FLINK-17389:


 Summary: LocalExecutorITCase.testBatchQueryCancel asserts error
 Key: FLINK-17389
 URL: https://issues.apache.org/jira/browse/FLINK-17389
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Zhijiang
 Fix For: 1.11.0


CI [https://api.travis-ci.org/v3/job/679144612/log.txt]
{code:java}
19:28:13.121 [INFO] ---
19:28:13.121 [INFO]  T E S T S
19:28:13.121 [INFO] ---
19:28:17.231 [INFO] Running 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase
19:32:06.049 [ERROR] Tests run: 70, Failures: 1, Errors: 0, Skipped: 5, Time 
elapsed: 228.813 s <<< FAILURE! - in 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase
19:32:06.051 [ERROR] testBatchQueryCancel[Planner: 
old](org.apache.flink.table.client.gateway.local.LocalExecutorITCase)  Time 
elapsed: 32.767 s  <<< FAILURE!
java.lang.AssertionError: expected: but was:
at 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase.testBatchQueryCancel(LocalExecutorITCase.java:738)

19:32:06.440 [INFO] 
19:32:06.440 [INFO] Results:
19:32:06.440 [INFO] 
19:32:06.440 [ERROR] Failures: 
19:32:06.440 [ERROR]   LocalExecutorITCase.testBatchQueryCancel:738 
expected: but was:
19:32:06.440 [INFO] 
19:32:06.440 [ERROR] Tests run: 70, Failures: 1, Errors: 0, Skipped: 5{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Exact feature freeze date

2020-04-23 Thread Zhijiang
+1 for extending the feature freeze until May 15th.

Best,
Zhijiang


--
From:Danny Chan 
Send Time:2020 Apr. 24 (Fri.) 10:51
To:dev 
Subject:Re: [DISCUSS] Exact feature freeze date

+1 for extending the feature freeze until May 15th.

Best,
Danny Chan
On Apr 24, 2020, 9:51 AM +0800, Yangze Guo wrote:
> +1
>
> Best,
> Yangze Guo
>
> On Fri, Apr 24, 2020 at 9:49 AM Dian Fu  wrote:
> >
> > +1
> >
> > Regards,
> > Dian
> >
> > > On Apr 24, 2020, at 9:47 AM, Leonard Xu  wrote:
> > >
> > > + 1 for the feature freeze date
> > >
> > > Best,
> > > Leonard Xu
> > >
> > >
> > > > On Apr 24, 2020, at 09:32, Jingsong Li  wrote:
> > > >
> > > > +1
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Fri, Apr 24, 2020 at 2:27 AM Zhu Zhu  wrote:
> > > >
> > > > > +1 for extending the code freeze date.
> > > > > FLIP-119 could benefit from it.
> > > > > May 15th sounds reasonable.
> > > > >
> > > > > Thanks,
> > > > > Zhu Zhu
> > > > >
> > > > > On Fri, Apr 24, 2020 at 12:01 AM, Jark Wu  wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks,
> > > > > > Jark
> > > > > >
> > > > > > On Thu, 23 Apr 2020 at 22:36, Xintong Song 
> > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > > From our side we can also benefit from the extending of feature 
> > > > > > > freeze,
> > > > > > for
> > > > > > > pluggable slot allocation, GPU support and perjob mode on 
> > > > > > > Kubernetes
> > > > > > > deployment.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Apr 23, 2020 at 10:31 PM Timo Walther 
> > > > > > wrote:
> > > > > > >
> > > > > > > > From the SQL side, I'm sure that FLIP-95 and FLIP-105 could 
> > > > > > > > benefit
> > > > > > > > from extending the feature freeze.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Timo
> > > > > > > >
> > > > > > > > On 23.04.20 16:11, Aljoscha Krettek wrote:
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > Aljoscha
> > > > > > > > >
> > > > > > > > > On 23.04.20 15:23, Till Rohrmann wrote:
> > > > > > > > > > +1 for extending the feature freeze until May 15th.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > On Thu, Apr 23, 2020 at 1:00 PM Piotr Nowojski <
> > > > > pi...@ververica.com
> > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Stephan,
> > > > > > > > > > >
> > > > > > > > > > > As release manager I’ve seen that quite a bit of features 
> > > > > > > > > > > could
> > > > > use
> > > > > > > > > > > of the
> > > > > > > > > > > extra couple of weeks. This also includes some features 
> > > > > > > > > > > that I’m
> > > > > > > > > > > involved
> > > > > > > > > > > with, like FLIP-76, or limiting the in-flight buffers.
> > > > > > > > > > >
> > > > > > > > > > > +1 From my side for extending the feature freeze until 
> > > > > > > > > > > May 15th.
> > > > > > > > > > >
> > > > > > > > > > > Piotrek
> > > > > > > > > > >
> > > > > > > > > > > > On 23 Apr 2020, at 10:10, Stephan Ewen 
> > > > > > > > > > > > 
> > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi all!
> > > > > > > > > > > >
> > > > > > > > > > > > I want to bring up a discussion about when we want to 
> > > > > > > > > > > > do the
> > > > > > feature
> > > > > > > > > > > freeze
> > > > > > > > > > > > for 1.11.
> > > > > > > > > > > >
> > > > > > > > > > > > When kicking off the release cycle, we tentatively set 
> > > > > > > > > > > > the date
> > > > > to
> > > > > > > > > > > > end of
> > > > > > > > > > > > April, which would be in one week.
> > > > > > > > > > > >
> > > > > > > > > > > > I can say from the features I am involved with (FLIP-27,
> > > > > FLIP-115,
> > > > > > > > > > > > reviewing some state backend improvements, etc.) that 
> > > > > > > > > > > > it would
> > > > > be
> > > > > > > > > > > > helpful
> > > > > > > > > > > > to have two additional weeks.
> > > > > > > > > > > >
> > > > > > > > > > > > When looking at various other feature threads, my
> > > > > > > > > > > > feeling is that there are more contributors and
> > > > > > > > > > > > committers that could use a few more days.
> > > > > > > > > > > > The last two months were quite exceptional, and we did
> > > > > > > > > > > > lose a bit of development speed here and there.
> > > > > > > > > > > >
> > > > > > > > > > > > What do you think about making *May 15th* the feature
> > > > > > > > > > > > freeze?
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Stephan
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > >
> >



[jira] [Created] (FLINK-17315) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointMassivelyParallel failed in timeout

2020-04-21 Thread Zhijiang (Jira)
Zhijiang created FLINK-17315:


 Summary: 
UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointMassivelyParallel 
failed in timeout
 Key: FLINK-17315
 URL: https://issues.apache.org/jira/browse/FLINK-17315
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


Build: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=45cc9205-bdb7-5b54-63cd-89fdc0983323]

logs
{code:java}
2020-04-21T20:25:23.1139147Z [ERROR] Errors: 
2020-04-21T20:25:23.1140908Z [ERROR]   
UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointMassivelyParallel:80->execute:87
 » TestTimedOut
2020-04-21T20:25:23.1141383Z [INFO] 
2020-04-21T20:25:23.1141675Z [ERROR] Tests run: 1525, Failures: 0, Errors: 1, 
Skipped: 36
{code}
 
I ran it on my local machine and it takes about 40 seconds to finish, so the 
configured 90-second timeout is sometimes not enough in a heavily loaded 
environment. Maybe we can remove the timeout from the tests, since Azure is 
already configured to monitor timeouts.
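A small stdlib-only sketch of the trade-off behind this suggestion. The class, method names, and numbers below are hypothetical, not the actual test harness: a tight per-test deadline fails a slow-but-healthy run, while deferring to an outer (CI-level) watchdog lets it complete.

```java
import java.util.concurrent.*;

// Stdlib-only sketch of the trade-off discussed above; names and numbers
// are illustrative, not the actual test harness.
public class TimeoutTradeoffSketch {
    // Simulate a test that is healthy but slow on a loaded machine.
    static Callable<String> slowButHealthyTest(long millis) {
        return () -> {
            Thread.sleep(millis);
            return "passed";
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> run = pool.submit(slowButHealthyTest(200));
        try {
            // A tight local deadline (like the 90s test timeout) flakes:
            run.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException flaky) {
            System.out.println("local deadline hit, test reported as failed");
        }
        // Without a local deadline, the outer watchdog (the CI-level
        // timeout) would only fire on a real hang; the healthy run
        // eventually completes:
        System.out.println(run.get()); // prints "passed"
        pool.shutdown();
    }
}
```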
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] New Apache Flink PMC Member - Hequn Chen

2020-04-19 Thread Zhijiang
Congratulations, Hequn!

Best,
Zhijiang


--
From:Yun Gao 
Send Time:2020 Apr. 19 (Sun.) 21:53
To:dev 
Subject:Re: [ANNOUNCE] New Apache Flink PMC Member - Hequn Chen

   Congratulations Hequn!

   Best,
Yun


--
From:Hequn Cheng 
Send Time:2020 Apr. 18 (Sat.) 12:48
To:dev 
Subject:Re: [ANNOUNCE] New Apache Flink PMC Member - Hequn Chen

Many thanks for your support. Thank you!

Best,
Hequn

On Sat, Apr 18, 2020 at 1:27 AM Jacky Bai  wrote:

> Congratulations, Hequn Chen! I hope to make as many contributions to Flink
> as you.
>
> Best
> Bai Xu
>
> On Fri, Apr 17, 2020 at 10:47 PM, Congxian Qiu  wrote:
>
> > Congratulations, Hequn!
> >
> > Best,
> > Congxian
> >
> >
> > On Fri, Apr 17, 2020 at 9:36 PM, Yu Li  wrote:
> >
> > > Congratulations, Hequn!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Fri, 17 Apr 2020 at 21:22, Kurt Young  wrote:
> > >
> > > > Congratulations Hequn!
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Fri, Apr 17, 2020 at 8:57 PM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > Congratulations Hequn!
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Apr 17, 2020 at 2:49 PM Shuo Cheng 
> > wrote:
> > > > >
> > > > > > Congratulations, Hequn
> > > > > >
> > > > > > Best,
> > > > > > Shuo
> > > > > >
> > > > > > On 4/17/20, hufeih...@mails.ucas.ac.cn <
> hufeih...@mails.ucas.ac.cn
> > >
> > > > > wrote:
> > > > > > > Congratulations , Hequn
> > > > > > >
> > > > > > > Best wish
> > > > > > >
> > > > > > >
> > > > > > > hufeih...@mails.ucas.ac.cn
> > > > > > > Congratulations, Hequn!
> > > > > > >
> > > > > > >> On Fri, Apr 17, 2020 at 3:02 PM, Paul Lam  wrote:
> > > > > > >
> > > > > > >> Congrats Hequn! Thanks a lot for your contribution to the
> > > community!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Paul Lam
> > > > > > >>
> > > > > > >> On Fri, Apr 17, 2020 at 2:58 PM, Dian Fu  wrote:
> > > > > > >>
> > > > > > >> > Congratulations, Hequn!
> > > > > > >> >
> > > > > > >> > > On Apr 17, 2020, at 2:36 PM, Becket Qin  wrote:
> > > > > > >> > >
> > > > > > >> > > Hi all,
> > > > > > >> > >
> > > > > > >> > > I am glad to announce that Hequn Chen has joined the Flink
> > > PMC.
> > > > > > >> > >
> > > > > > >> > > Hequn has contributed to Flink for years. He has worked on
> > > > several
> > > > > > >> > > components including Table / SQL,PyFlink and Flink ML
> > > Pipeline.
> > > > > > >> Besides,
> > > > > > >> > > Hequn is also very active in the community since the
> > > beginning.
> > > > > > >> > >
> > > > > > >> > > Congratulations, Hequn! Looking forward to your future
> > > > > > contributions.
> > > > > > >> > >
> > > > > > >> > > Thanks,
> > > > > > >> > >
> > > > > > >> > > Jiangjie (Becket) Qin
> > > > > > >> > > (On behalf of the Apache Flink PMC)
> > > > > > >> >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best Regards
> > > > > > >
> > > > > > > Jeff Zhang
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>




Re: [VOTE] FLIP-118: Improve Flink’s ID system

2020-04-16 Thread Zhijiang
Thanks for this FLIP, Yangze. 

Sorry for not being involved in the previous discussion. In general I like the 
proposed direction of enriching the related IDs with more information for 
debugging and correlation.

But I have a small reminder regarding the changes. After a failover restart, 
every related ID is randomly regenerated and therefore differs from before. I 
am not sure whether this was an intentional design decision, but it is the 
ground truth now.

E.g. ExecutionAttemptID, from which the ResultPartitionID is also derived to 
guarantee uniqueness after failover. Based on the current implementation, it 
seems we do not store or rely on the previous ID states after failover, but I 
am not sure whether this assumption will hold for future features.

Anyway, luckily the proposed changes in this FLIP do not break this behavior 
when introducing the `attemptNumber` in `ExecutionAttemptID`, so we do not 
need to consider this issue further for now.

+1 (binding).

Best,
Zhijiang


--
From:Till Rohrmann 
Send Time:2020 Apr. 16 (Thu.) 21:39
To:dev 
Subject:Re: [VOTE] FLIP-118: Improve Flink’s ID system

Thanks for creating this FLIP, Yangze.

+1 (binding).

Cheers,
Till

On Thu, Apr 16, 2020 at 10:51 AM Yangze Guo  wrote:

> Hi everyone,
>
> I'd like to start the vote of FLIP-118 [1], which improves the log
> readability by adding more information to Flink’s IDs. This FLIP is
> discussed in the thread[2].
>
> The vote will be open for at least 72 hours. Unless there is an objection,
> I will try to close it by April 21, 2020 10:00 UTC if we have received
> sufficient votes.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148643521
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-118-Improve-Flink-s-ID-system-td39321.html
>
> Best,
> Yangze Guo
>



[jira] [Created] (FLINK-17095) KafkaProducerExactlyOnceITCase fails with "address already in use"

2020-04-12 Thread Zhijiang (Jira)
Zhijiang created FLINK-17095:


 Summary: KafkaProducerExactlyOnceITCase fails with "address 
already in use"
 Key: FLINK-17095
 URL: https://issues.apache.org/jira/browse/FLINK-17095
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


Logs: [https://travis-ci.org/github/apache/flink/jobs/673786814]
{code:java}
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 7.256 s 
<<< FAILURE! - in 
org.apache.flink.streaming.connectors.kafka.KafkaProducerExactlyOnceITCase
[ERROR] 
org.apache.flink.streaming.connectors.kafka.KafkaProducerExactlyOnceITCase  
Time elapsed: 7.256 s  <<< ERROR!
org.apache.kafka.common.KafkaException: Socket server failed to bind to 
0.0.0.0:42733: Address already in use.
at kafka.network.Acceptor.openServerSocket(SocketServer.scala:573)
at kafka.network.Acceptor.(SocketServer.scala:451)
at kafka.network.SocketServer.createAcceptor(SocketServer.scala:245)
at 
kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:215)
at 
kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:214)
at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58)
at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at 
kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:214)
at kafka.network.SocketServer.startup(SocketServer.scala:114)
at kafka.server.KafkaServer.startup(KafkaServer.scala:253)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.getKafkaServer(KafkaTestEnvironmentImpl.java:404)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.prepare(KafkaTestEnvironmentImpl.java:131)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.startClusters(KafkaTestBase.java:142)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.startClusters(KafkaTestBase.java:131)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.prepare(KafkaTestBase.java:100)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.prepare(KafkaTestBase.java:92)
at 
org.apache.flink.streaming.connectors.kafka.KafkaProducerExactlyOnceITCase.prepare(KafkaProducerExactlyOnceITCase.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:78)
at kafka.netwo

[jira] [Created] (FLINK-17094) OverWindowITCase#testRowTimeBoundedPartitionedRowsOver failed by FileNotFoundException

2020-04-12 Thread Zhijiang (Jira)
Zhijiang created FLINK-17094:


 Summary: OverWindowITCase#testRowTimeBoundedPartitionedRowsOver 
failed by FileNotFoundException
 Key: FLINK-17094
 URL: https://issues.apache.org/jira/browse/FLINK-17094
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


Build: [https://travis-ci.org/github/apache/flink/jobs/673786805]

logs
{code:java}
[ERROR] 
testRowTimeBoundedPartitionedRowsOver[StateBackend=ROCKSDB](org.apache.flink.table.planner.runtime.stream.sql.OverWindowITCase)
  Time elapsed: 0.754 s  <<< ERROR!
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
at 
org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:659)
at 
org.apache.flink.streaming.util.TestStreamEnvironment.execute(TestStreamEnvironment.java:77)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1643)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1625)
at 
org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:673)
at 
org.apache.flink.table.planner.runtime.stream.sql.OverWindowITCase.testRowTimeBoundedPartitionedRowsOver(OverWindowITCase.scala:417)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter

[jira] [Created] (FLINK-17092) Pyflink failure for BlinkStreamDependencyTests and StreamPandasUDFITTests

2020-04-11 Thread Zhijiang (Jira)
Zhijiang created FLINK-17092:


 Summary: Pyflink failure for BlinkStreamDependencyTests and 
StreamPandasUDFITTests
 Key: FLINK-17092
 URL: https://issues.apache.org/jira/browse/FLINK-17092
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


Build: 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=7324=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=14487301-07d2-5d56-5690-6dfab9ffd4d9]

logs
{code:java}
2020-04-10T13:05:25.7259119Z E   : 
java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
2020-04-10T13:05:25.7259755Z E  at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
2020-04-10T13:05:25.7260301Z E  at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
2020-04-10T13:05:25.7260927Z E  at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1663)
2020-04-10T13:05:25.7261772Z E  at 
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
2020-04-10T13:05:25.7262405Z E  at 
org.apache.flink.table.planner.delegation.ExecutorBase.execute(ExecutorBase.java:51)
2020-04-10T13:05:25.7263073Z E  at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:719)
2020-04-10T13:05:25.7263588Z E  at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-04-10T13:05:25.7264090Z E  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-04-10T13:05:25.7264668Z E  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-04-10T13:05:25.7265175Z E  at 
java.lang.reflect.Method.invoke(Method.java:498)
2020-04-10T13:05:25.7265807Z E  at 
org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
2020-04-10T13:05:25.7266445Z E  at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
2020-04-10T13:05:25.7267288Z E  at 
org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
2020-04-10T13:05:25.7267897Z E  at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
2020-04-10T13:05:25.7268518Z E  at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
2020-04-10T13:05:25.7269130Z E  at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
2020-04-10T13:05:25.7269623Z E  at 
java.lang.Thread.run(Thread.java:748)
2020-04-10T13:05:25.7270112Z E   Caused by: 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
2020-04-10T13:05:25.7270700Z E  at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
2020-04-10T13:05:25.7271406Z E  at 
org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:175)
2020-04-10T13:05:25.7272111Z E  at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
2020-04-10T13:05:25.7272665Z E  at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
2020-04-10T13:05:25.7273245Z E  at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
2020-04-10T13:05:25.7273909Z E  at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
2020-04-10T13:05:25.7274514Z E  at 
org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
2020-04-10T13:05:25.7275147Z E  at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
2020-04-10T13:05:25.7275800Z E  at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
2020-04-10T13:05:25.7276447Z E  at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
2020-04-10T13:05:25.7277239Z E  at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975

Re: [DISCUSS] Creating a new repo to host Flink benchmarks

2020-04-09 Thread Zhijiang
+1 for the proposal.

Best,
Zhijiang 


--
From:Robert Metzger 
Send Time:2020 Apr. 10 (Fri.) 02:15
To:dev 
Subject:Re: [DISCUSS] Creating a new repo to host Flink benchmarks

+1 on creating the repo.


On Thu, Apr 9, 2020 at 5:54 PM Till Rohrmann  wrote:

> I think it is a good idea to make the benchmarks available to the community
> via a repo under the Apache project and to make updating it part of the
> release process. Hence +1 for the proposal.
>
> Cheers,
> Till
>
> On Thu, Apr 9, 2020 at 4:01 PM Piotr Nowojski  wrote:
>
> > Hi Yun Tang,
> >
> > Thanks for proposing the idea. Since we can not include benchmarks in the
> > Flink repository what you are proposing is the second best option.
> >
> > +1 from my side for the proposal.
> >
> > I think benchmarks have proven their value to justify this.
> >
> > Piotrek
> >
> > > On 9 Apr 2020, at 08:56, Yun Tang  wrote:
> > >
> > > Hi Flink devs,
> > >
> > > As Flink develops rapidly with more and more features added, ensuring
> > > that no performance regressions are introduced has become increasingly
> > > important. We would like to create a new repo under the Apache project
> > > to host the existing flink-benchmarks [1] repo, an idea inspired by the
> > > discussion under FLINK-16850 [2].
> > >
> > > Some background context on flink-benchmarks, for those who are not
> > familiar with the project yet:
> > >
> > > - The current flink-benchmarks does not align with Flink releases, which
> > >   makes it hard for developers to verify performance at a specific Flink
> > >   version, because flink-benchmarks always depends on the latest
> > >   interfaces.
> > > - The above problem would be solved if flink-benchmarks also created a
> > >   release branch whenever we release Flink. However, the flink-benchmarks
> > >   repo is currently hosted under the dataArtisans (the former name of
> > >   Ververica) project, which is not covered by the Flink release manual
> > >   [3]. We propose to promote this repo under the Apache project so that
> > >   the release manager has the right to release flink-benchmarks.
> > > - The reason we do not merge flink-benchmarks into the apache/flink repo
> > >   is that it heavily depends on JMH [4], which is under the GPLv2 license.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Yun Tang
> > >
> > > [1] https://github.com/dataArtisans/flink-benchmarks
> > > [2] https://issues.apache.org/jira/browse/FLINK-16850
> > > [3]
> >
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> > > [4] https://openjdk.java.net/projects/code-tools/jmh/
> > >
> > >
> >
> >
>

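A side note on why the JMH dependency discussed in the thread above matters: hand-rolled benchmark timing is unreliable, which is exactly what JMH is designed to fix. The following self-contained sketch (class and method names are illustrative, not part of flink-benchmarks) shows such a naive measurement and its pitfalls:

```java
public class NaiveBenchmark {

    // Workload under test: sum of squares 0^2 + 1^2 + ... + (n-1)^2.
    static long sumOfSquares(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += (long) i * i;
        }
        return s;
    }

    // Naive timing of a single cold run. The result is distorted by JIT
    // warm-up, dead-code elimination, and clock granularity -- the very
    // problems JMH solves with forked, warmed-up, statistically sampled runs.
    static long timeOnceNanos(int n) {
        long start = System.nanoTime();
        long result = sumOfSquares(n);
        long elapsed = System.nanoTime() - start;
        // Consume the result so the JIT cannot prove the loop is dead code.
        if (result == -1) {
            System.out.println("unreachable");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        System.out.println("elapsed ns (unreliable!): " + timeOnceNanos(1_000_000));
    }
}
```

JMH addresses these pitfalls with forked JVMs, warm-up iterations, and blackhole consumption of results, which is why the benchmarks cannot simply drop the dependency.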


Re: Configuring autolinks to Flink JIRA ticket in github repos

2020-04-09 Thread Zhijiang
Very nice work! Thanks Yun for finding this feature and making it happen!

Best,
Zhijiang


--
From:Xingbo Huang 
Send Time:2020 Apr. 9 (Thu.) 20:23
To:dev 
Subject:Re: Configuring autolinks to Flink JIRA ticket in github repos

Thanks Yun,

Good work. It is now very convenient to link to JIRA from the corresponding
PR.

Best,
Xingbo

Hequn Cheng  于2020年4月9日周四 下午8:16写道:

> It’s much more convenient now. Thank you!
>
> > On Apr 9, 2020, at 8:01 PM, Aljoscha Krettek 
> wrote:
> >
> > That is very nice! Thanks for taking care of this ~3q
> >
> > On 09.04.20 11:08, Dian Fu wrote:
> >> Cool! Thanks Yun for this effort. Very useful feature.
> >> Regards,
> >> Dian
> >>> 在 2020年4月9日,下午4:32,Yu Li  写道:
> >>>
> >>> Great! Thanks for the efforts Yun.
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>
> >>> On Thu, 9 Apr 2020 at 16:15, Jark Wu  wrote:
> >>>
> >>>> Thanks Yun,
> >>>>
> >>>> This's a great feature! I was surprised by the autolink feature
> yesterday
> >>>> (didn't know your work at that time).
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>> On Thu, 9 Apr 2020 at 16:12, Yun Tang  wrote:
> >>>>
> >>>>> Hi community
> >>>>>
> >>>>> The autolink to Flink JIRA tickets has taken effect. You can refer to
> >>>>> the commit details page [1] and see that all Flink JIRA ids in commit
> >>>>> titles are now underlined hyperlinks. Moreover, you no longer need
> >>>>> markdown to create a hyperlink to a Flink JIRA ticket when discussing
> >>>>> in pull requests, e.g. FLINK-16850 points to the ticket directly
> >>>>> instead of
> >>>>> [FLINK-16850](https://issues.apache.org/jira/browse/FLINK-16850)
> >>>>>
> >>>>>
> >>>>> [1] https://github.com/apache/flink/commits/master
> >>>>>
> >>>>> Best
> >>>>> Yun Tang
> >>>>>
> >>>>> 
> >>>>> From: Till Rohrmann 
> >>>>> Sent: Thursday, April 2, 2020 23:11
> >>>>> To: dev 
> >>>>> Subject: Re: Configuring autolinks to Flink JIRA ticket in github
> repos
> >>>>>
> >>>>> Nice, this is a cool feature. Thanks for asking INFRA for it.
> >>>>>
> >>>>> Cheers,
> >>>>> Till
> >>>>>
> >>>>> On Wed, Apr 1, 2020 at 6:52 PM Yun Tang  wrote:
> >>>>>
> >>>>>> Hi community.
> >>>>>>
> >>>>>> I noticed that GitHub recently added support for autolink
> >>>>>> references [1]. This allows developers to open the JIRA ticket link
> >>>>>> directly from a pull request title when browsing a GitHub repo.
> >>>>>>
> >>>>>> I have already created INFRA-20055 [2] to request this configuration
> >>>>>> for seven Flink-related GitHub repos. Hope it can be resolved soon!
> >>>>>>
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>
> >>>>
> https://help.github.com/en/github/administering-a-repository/configuring-autolinks-to-reference-external-resources
> >>>>>> [2] https://issues.apache.org/jira/browse/INFRA-20055
> >>>>>>
> >>>>>> Best
> >>>>>> Yun Tang
> >>>>>>
> >>>>>
> >>>>
> >
>
>

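For illustration, the autolink rewrite discussed in the thread above is conceptually just a pattern substitution. Below is a minimal client-side sketch; the class name and the markdown output format are illustrative (GitHub performs this server-side from the configured reference prefix and URL template):

```java
import java.util.regex.Pattern;

public class JiraAutolink {

    // Matches bare ticket ids such as "FLINK-16850".
    private static final Pattern TICKET = Pattern.compile("\\bFLINK-(\\d+)\\b");

    // Rewrites each ticket id into a markdown link pointing at the JIRA
    // issue, mimicking what GitHub's autolink feature does server-side.
    static String linkify(String text) {
        return TICKET.matcher(text)
                .replaceAll("[FLINK-$1](https://issues.apache.org/jira/browse/FLINK-$1)");
    }

    public static void main(String[] args) {
        System.out.println(linkify("See FLINK-16850 for details"));
    }
}
```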


Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

2020-04-09 Thread Zhijiang
Great work! Thanks Gordon for the continuous efforts on enhancing Stateful 
Functions and for the efficient release!
I hope Stateful Functions becomes more and more popular among users.

Best,
Zhijiang


--
From:Yun Tang 
Send Time:2020 Apr. 9 (Thu.) 00:17
To:Till Rohrmann ; dev 
Cc:Oytun Tez ; user 
Subject:Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

Excited to see the stateful functions release!
Thanks for the great work of release manager Gordon and everyone who contributed 
to this.

Best
Yun Tang

From: Till Rohrmann 
Sent: Wednesday, April 8, 2020 14:30
To: dev 
Cc: Oytun Tez ; user 
Subject: Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

Great news! Thanks a lot for being our release manager Gordon and to everyone 
who helped with the release.

Cheers,
Till

On Wed, Apr 8, 2020 at 3:57 AM Congxian Qiu 
mailto:qcx978132...@gmail.com>> wrote:
Thanks a lot for the release and your great job, Gordon!
Also thanks to everyone who made this release possible!

Best,
Congxian


Oytun Tez mailto:oy...@motaword.com>> 于2020年4月8日周三 上午2:55写道:

> I should also add, I couldn't agree more with this sentence in the release
> article: "state access/updates and messaging need to be integrated."
>
> This is something we strictly enforce in our Flink case, where we do not
> refer to anything external for storage, use Flink as our DB.
>
>
>
>  --
>
> [image: MotaWord]
> Oytun Tez
> M O T A W O R D | CTO & Co-Founder
> oy...@motaword.com<mailto:oy...@motaword.com>
>
>   <https://www.motaword.com/blog>
>
>
> On Tue, Apr 7, 2020 at 12:26 PM Oytun Tez 
> mailto:oy...@motaword.com>> wrote:
>
>> Great news! Thank you all.
>>
>> On Tue, Apr 7, 2020 at 12:23 PM Marta Paes Moreira 
>> mailto:ma...@ververica.com>>
>> wrote:
>>
>>> Thank you for managing the release, Gordon — you did a tremendous job!
>>> And to everyone else who worked on pushing it through.
>>>
>>> Really excited about the new use cases that StateFun 2.0 unlocks for
>>> Flink users and beyond!
>>>
>>>
>>> Marta
>>>
>>> On Tue, Apr 7, 2020 at 4:47 PM Hequn Cheng 
>>> mailto:he...@apache.org>> wrote:
>>>
>>>> Thanks a lot for the release and your great job, Gordon!
>>>> Also thanks to everyone who made this release possible!
>>>>
>>>> Best,
>>>> Hequn
>>>>
>>>> On Tue, Apr 7, 2020 at 8:58 PM Tzu-Li (Gordon) Tai 
>>>> mailto:tzuli...@apache.org>>
>>>> wrote:
>>>>
>>>>> The Apache Flink community is very happy to announce the release of
>>>>> Apache Flink Stateful Functions 2.0.0.
>>>>>
>>>>> Stateful Functions is an API that simplifies building distributed
>>>>> stateful applications.
>>>>> It's based on functions with persistent state that can interact
>>>>> dynamically with strong consistency guarantees.
>>>>>
>>>>> Please check out the release blog post for an overview of the release:
>>>>> https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html
>>>>>
>>>>> The release is available for download at:
>>>>> https://flink.apache.org/downloads.html
>>>>>
>>>>> Maven artifacts for Stateful Functions can be found at:
>>>>> https://search.maven.org/search?q=g:org.apache.flink%20statefun
>>>>>
>>>>> Python SDK for Stateful Functions published to the PyPI index can be
>>>>> found at:
>>>>> https://pypi.org/project/apache-flink-statefun/
>>>>>
>>>>> Official Docker image for building Stateful Functions applications is
>>>>> currently being published to Docker Hub.
>>>>> Dockerfiles for this release can be found at:
>>>>> https://github.com/apache/flink-statefun-docker/tree/master/2.0.0
>>>>> Progress for creating the Docker Hub repository can be tracked at:
>>>>> https://github.com/docker-library/official-images/pull/7749
>>>>>
>>>>> The full release notes are available in Jira:
>>>>>
>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12346878
>>>>>
>>>>> We would like to thank all contributors of the Apache Flink community
>>>>> who made this release possible!
>>>>>
>>>>> Cheers,
>>>>> Gordon
>>>>>
>>>> --
>>  --
>>
>> [image: MotaWord]
>> Oytun Tez
>> M O T A W O R D | CTO & Co-Founder
>> oy...@motaword.com<mailto:oy...@motaword.com>
>>
>>   <https://www.motaword.com/blog>
>>
>



Re: [ANNOUNCE] New Flink committer: Seth Wiesman

2020-04-07 Thread Zhijiang
Congratulations, Seth!

Best,
Zhijiang


--
From:tison 
Send Time:2020 Apr. 7 (Tue.) 19:26
To:dev 
Subject:Re: [ANNOUNCE] New Flink committer: Seth Wiesman

Congratulations, Seth!

Best,
tison.


Yu Li  于2020年4月7日周二 下午6:57写道:

> Congratulations, Seth!
>
> Best Regards,
> Yu
>
>
> On Tue, 7 Apr 2020 at 18:16, Benchao Li  wrote:
>
> > Congratulations~
> >
> > Hequn Cheng  于2020年4月7日周二 下午5:22写道:
> >
> > > Congratulations Seth!
> > >
> > > Best, Hequn
> > >
> > > On Tue, Apr 7, 2020 at 4:11 PM Fabian Hueske 
> wrote:
> > >
> > > > Congrats Seth! Well deserved :-)
> > > >
> > > > Cheers, Fabian
> > > >
> > > > Am Di., 7. Apr. 2020 um 10:09 Uhr schrieb Yangze Guo <
> > karma...@gmail.com
> > > >:
> > > >
> > > > > Congratulations Seth!
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Tue, Apr 7, 2020 at 4:07 PM Jiayi Liao  >
> > > > wrote:
> > > > > >
> > > > > > >
> > > > > > > Congratulations Seth :)
> > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >
>



Re: [VOTE] FLIP-119: Pipelined Region Scheduling

2020-04-02 Thread Zhijiang
+1 (binding)

Best,
Zhijiang


--
From:Till Rohrmann 
Send Time:2020 Apr. 2 (Thu.) 23:09
To:dev 
Cc:zhuzh 
Subject:Re: [VOTE] FLIP-119: Pipelined Region Scheduling

+1

Cheers,
Till

On Tue, Mar 31, 2020 at 5:52 PM Gary Yao  wrote:

> Hi all,
>
> I would like to start the vote for FLIP-119 [1], which is discussed and
> reached a consensus in the discussion thread [2].
>
> The vote will be open until April 3 (72h) unless there is an objection
> or not enough votes.
>
> Best,
> Gary
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-119-Pipelined-Region-Scheduling-tp39350.html
>



Re: [VOTE] FLIP-102: Add More Metrics to TaskManager

2020-04-01 Thread Zhijiang
Thanks for the FLIP, Yadong. In general I think this work is valuable for users 
to better understand Flink's memory usage in its different dimensions.

Sorry for not going through every detailed discussion below; I will try to do 
that later if possible. First, let me try to answer some of Andrey's concerns 
regarding mmap.

> - I do not know how the mapped memory works. Is it meant for the new spilled 
> partitions? If the mapped memory also pulls from the direct
> memory limit then this is something we do not account in our network buffers 
> as I understand. In this case, this metric may be useful for tuning to 
> understand
> how much the mapped memory uses from the direct memory limit to set e.g. 
> framework off-heap limit correctly and avoid direct OOM.
> It could be something to discuss with Zhijiang. e.g. is the direct memory 
> used there to buffer fetched regions of partition files or what for?

Yes, mapped memory is used by the bounded blocking partition for batch jobs 
now, although it is not the default mode.

AFAIK it is neither related to nor limited by the `MaxDirectMemory` setting, so 
we do not need to worry about the current direct memory setting or a potential 
direct OOM issue.
The mapped file size is only bounded by the address space, which on a 64-bit 
system is effectively unlimited in theory.

Regarding the size of the mapped buffer pool reported by MXBean: it only 
indicates how much of the file has already been mapped, and even if unchanged 
it does not reflect the real physical memory usage. E.g., if a 100GB region of 
the file is mapped at the beginning, the mapped buffer pool reported by MXBean 
would be 100GB. But how much physical memory is really consumed depends on the 
specific read or write operations in practice, and is also controlled by the 
operating system; e.g., unused regions might be swapped out when physical 
memory is limited.

From this point of view, I think it is not meaningful to show the size of the 
mapped buffer pool to users, who may be more concerned with how much physical 
memory is really used.
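The MXBean behavior described above can be observed directly with a small sketch (the class name is illustrative; this is a standalone demo, not Flink code):

```java
import java.io.IOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedPoolProbe {

    // Returns the "mapped" BufferPoolMXBean, which reports how many bytes of
    // files are currently mmap-ed -- not how much physical memory is resident.
    static BufferPoolMXBean mappedPool() {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("mapped".equals(pool.getName())) {
                return pool;
            }
        }
        throw new IllegalStateException("no mapped buffer pool found");
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".bin");
        long before = mappedPool().getMemoryUsed();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map 1 MiB; the pool counter grows by the mapped size right away,
            // even though no page of the file has been touched yet.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);
            long after = mappedPool().getMemoryUsed();
            System.out.println("mapped bytes delta: " + (after - before));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running it shows the mapped pool growing by the full mapped size immediately, before any page of the file is touched, which is why this metric says little about resident physical memory.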

Best,
Zhijiang


--
From:Andrey Zagrebin 
Send Time:2020 Mar. 30 (Mon.) 22:56
To:dev 
Subject:Re: [VOTE] FLIP-102: Add More Metrics to TaskManager

Hi All,

Thanks for this FLIP, Yadong. This is a very good improvement to the
Flink's UI.
It looks like there are still couple of things to resolve before the final
vote.

- I also find the non-heap title in configuration confusing because there
are also other non-heap types of memory. The "off-heap" concept is quite
broad.
What about "JVM specific" meaning that it is not coming directly from Flink?
or we could remove the "Non-heap" box at all and show directly JVM
Metaspace and Overhead as separate boxes,
this would also fit if we decide to keep the Metaspace metric.

- Total Process Memory Used: I agree with Xintong, it is hard to say what
is used there.
Then the size of "Total Process Memory" basically becomes part of
configuration.

- Non-Heap Used/Max/.. Not sure what committed means here. I also think we
should either exclude it or display what is known for sure.
In general, the metaspace usage would be nice to have but it should be then
exactly metaspace usage without any thing else.

- I do not know how the mapped memory works. Is it meant for the new
spilled partitions? If the mapped memory also pulls from the direct
memory limit
then this is something we do not account in our network buffers as I
understand. In this case, this metric may be useful for tuning to understand
how much the mapped memory uses from the direct memory limit to set e.g.
framework off-heap limit correctly and avoid direct OOM.
It could be something to discuss with Zhijiang. e.g. is the direct
memory used there to buffer fetched regions of partition files or what for?

- Not sure we need an extra wrapping box "other" for the managed memory
atm. It could be just "Managed" or "Managed by Flink".

Best,
Andrey

On Fri, Mar 27, 2020 at 6:13 AM Xintong Song  wrote:

> Sorry for the late response.
>
> I have shared my suggestions with Yadong & Lining offline. I think it would
> be better to also post them here, for the public record.
>
>- I'm not sure about displaying Total Process Memory Used. Currently, we
>do not have a good way to monitor all memory footprints of the process.
>Metrics for some native memory usages (e.g., thread stack) are absent.
>Displaying a partial used memory size could be confusing for users.
>- I would suggest merge the current Mapped Memory metrics into Direct
>Memory. Actually, the metrics are retrieved from MXBeans for direct
> buffer
>pool and mapped buffer pool. Both of the two pools are accounted for in
>-XX:MaxDirectMemorySize. There's no Flink configuration that can modify
> the
>ind

Re: [ANNOUNCE] Flink on Zeppelin (Zeppelin 0.9 is released)

2020-03-30 Thread Zhijiang

Thanks for your continuous efforts on the Flink ecosystem, Jeff!
Glad to see this achievement. I hope more users will try it out in practice.

Best,
Zhijiang



--
From:Dian Fu 
Send Time:2020 Mar. 31 (Tue.) 10:15
To:Jeff Zhang 
Cc:user ; dev 
Subject:Re: [ANNOUNCE] Flink on Zeppelin (Zeppelin 0.9 is released)

Hi Jeff,

Thanks for the great work and sharing it with the community! Very impressive 
and will try it out.

Regards,
Dian

在 2020年3月30日,下午9:16,Till Rohrmann  写道:
This is great news Jeff! Thanks a lot for sharing it with the community. 
Looking forward trying Flink on Zeppelin out :-)

Cheers,
Till
On Mon, Mar 30, 2020 at 2:47 PM Jeff Zhang  wrote:
Hi Folks,

I am very excited to announce that the integration of Flink with the Apache 
Zeppelin notebook is completed. You can now run Flink jobs via the DataStream 
API, Table API, SQL, and PyFlink in an Apache Zeppelin notebook. You can 
download it here: http://zeppelin.apache.org/download.html

Here are some highlights of this work:

1. Support for 3 execution modes: local, remote, yarn
2. Support for multiple languages in one Flink session: Scala, Python, SQL
3. Hive connector support (reading from and writing to Hive)
4. Dependency management
5. UDF support (Scala, PyFlink)
6. Support for both batch SQL and streaming SQL

For more details and usage instructions, you can refer to the following 4 blog 
posts:

1) Get started: https://link.medium.com/oppqD6dIg5
2) Batch: https://link.medium.com/3qumbwRIg5
3) Streaming: https://link.medium.com/RBHa2lTIg5
4) Advanced usage: https://link.medium.com/CAekyoXIg5

You are welcome to use Flink on Zeppelin and to give feedback and comments. 

-- 
Best Regards

Jeff Zhang



Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-27 Thread Zhijiang
+1 for this proposal. Very reasonable analysis!

Best,
Zhijiang 


--
From:Hequn Cheng 
Send Time:2020 Mar. 27 (Fri.) 09:46
To:dev 
Cc:private 
Subject:Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

+1 for a separate repository.
The dedicated `flink-docker` repo works fine now. We can do it similarly.

Best,
Hequn

On Fri, Mar 27, 2020 at 1:16 AM Till Rohrmann  wrote:

> +1 for a separate repository.
>
> Cheers,
> Till
>
> On Thu, Mar 26, 2020 at 5:13 PM Ufuk Celebi  wrote:
>
> > +1.
> >
> > The repo creation process is a light-weight, automated process on the ASF
> > side. When Patrick Lucas contributed docker-flink back to the Flink
> > community (as flink-docker), there was virtually no overhead in creating
> > the repository. Reusing build scripts should still be possible at the
> cost
> > of some duplication which is fine imo.
> >
> > – Ufuk
> >
> > On Thu, Mar 26, 2020 at 4:18 PM Stephan Ewen  wrote:
> > >
> > > +1 to a separate repository.
> > >
> > > It seems to be best practice in the docker community.
> > > And since it does not add overhead, why not go with the best practice?
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > wrote:
> > >>
> > >> Hi Flink devs,
> > >>
> > >> As part of a Stateful Functions release, we would like to publish
> > Stateful
> > >> Functions Docker images to Dockerhub as an official image.
> > >>
> > >> Some background context on Stateful Function images, for those who are
> > not
> > >> familiar with the project yet:
> > >>
> > >>- Stateful Function images are built on top of the Flink official
> > >>images, with additional StateFun dependencies being added.
> > >>You can take a look at the scripts we currently use to build the
> > images
> > >>locally for development purposes [1].
> > >>- They are quite important for user experience, since building a
> > Docker
> > >>image is the recommended go-to deployment mode for StateFun user
> > >>applications [2].
> > >>
> > >>
> > >> A prerequisite for all of this is to first decide where we host the
> > >> Stateful Functions Dockerfiles,
> > >> before we can proceed with the process of requesting a new official
> > image
> > >> repository at Dockerhub.
> > >>
> > >> We’re proposing to create a new dedicated repo for this purpose,
> > >> with the name `apache/flink-statefun-docker`.
> > >>
> > >> While we did initially consider integrating the StateFun Dockerfiles
> to
> > be
> > >> hosted together with the Flink ones in the existing
> > `apache/flink-docker`
> > >> repo, we had the following concerns:
> > >>
> > >>- In general, it is a convention that each official Dockerhub image
> > is
> > >>backed by a dedicated source repo hosting the Dockerfiles.
> > >>- The `apache/flink-docker` repo already has quite a few dedicated
> > >>tooling and CI smoke tests specific for the Flink images.
> > >>- Flink and StateFun have separate versioning schemes and
> independent
> > >>release cycles. A new Flink release does not necessarily require a
> > >>“lock-step” to release new StateFun images as well.
> > >>- Considering the above all-together, and the fact that creating a
> > new
> > >>repo is rather low-effort, having a separate repo would probably
> make
> > more
> > >>sense here.
> > >>
> > >>
> > >> What do you think?
> > >>
> > >> Cheers,
> > >> Gordon
> > >>
> > >> [1]
> > >>
> >
> >
> https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
> > >> [2]
> > >>
> >
> >
> https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html
> >
>



[jira] [Created] (FLINK-16821) Run Kubernetes test failed with invalid named "minikube"

2020-03-26 Thread Zhijiang (Jira)
Zhijiang created FLINK-16821:


 Summary: Run Kubernetes test failed with invalid named "minikube"
 Key: FLINK-16821
 URL: https://issues.apache.org/jira/browse/FLINK-16821
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes, Tests
Reporter: Zhijiang


This is the test run 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6702=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5]

Log output
{code:java}
2020-03-27T00:07:38.9666021Z Running 'Run Kubernetes test'
2020-03-27T00:07:38.956Z 
==
2020-03-27T00:07:38.9677101Z TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-38967103614
2020-03-27T00:07:41.7529865Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-27T00:07:41.7721475Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-27T00:07:41.8208394Z Docker version 19.03.8, build afacb8b7f0
2020-03-27T00:07:42.4793914Z docker-compose version 1.25.4, build 8d51620a
2020-03-27T00:07:42.5359301Z Installing minikube ...
2020-03-27T00:07:42.5494076Z   % Total% Received % Xferd  Average Speed   
TimeTime Time  Current
2020-03-27T00:07:42.5494729Z  Dload  Upload   
Total   SpentLeft  Speed
2020-03-27T00:07:42.5498136Z 
2020-03-27T00:07:42.6214887Z   0 00 00 0  0  0 
--:--:-- --:--:-- --:--:-- 0
2020-03-27T00:07:43.3467750Z   0 00 00 0  0  0 
--:--:-- --:--:-- --:--:-- 0
2020-03-27T00:07:43.3469636Z 100 52.0M  100 52.0M0 0  65.2M  0 
--:--:-- --:--:-- --:--:-- 65.2M
2020-03-27T00:07:43.4262625Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.4264438Z   - To fix this, run: minikube start
2020-03-27T00:07:43.4282404Z Starting minikube ...
2020-03-27T00:07:43.7749694Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:43.7761742Z * Using the none driver based on user configuration
2020-03-27T00:07:43.7762229Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:43.8202161Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.8203353Z   - To fix this, run: minikube start
2020-03-27T00:07:43.8568899Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.8570685Z   - To fix this, run: minikube start
2020-03-27T00:07:43.8583793Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:48.9017252Z * There is no local cluster named "minikube"
2020-03-27T00:07:48.9019347Z   - To fix this, run: minikube start
2020-03-27T00:07:48.9031515Z Starting minikube ...
2020-03-27T00:07:49.0612601Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:49.0616688Z * Using the none driver based on user configuration
2020-03-27T00:07:49.0620173Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:49.1040676Z * There is no local cluster named "minikube"
2020-03-27T00:07:49.1042353Z   - To fix this, run: minikube start
2020-03-27T00:07:49.1453522Z * There is no local cluster named "minikube"
2020-03-27T00:07:49.1454594Z   - To fix this, run: minikube start
2020-03-27T00:07:49.1468436Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:54.1907713Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.1909876Z   - To fix this, run: minikube start
2020-03-27T00:07:54.1921479Z Starting minikube ...
2020-03-27T00:07:54.3388738Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:54.3395499Z * Using the none driver based on user configuration
2020-03-27T00:07:54.3396443Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:54.3824399Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.3837652Z   - To fix this, run: minikube start
2020-03-27T00:07:54.4203902Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.4204895Z   - To fix this, run: minikube start
2020-03-27T00:07:54.4217866Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:59.4235917Z Command: start_kubernetes_if_not_running failed 3 
times.
2020-03-27T00:07:59.4236459Z Could not start minikube. Aborting...
2020-03-27T00:07:59.8439850Z The connection to the server localhost:8080 was 
refused - did you specify the right host or port?
2020-03-27T00:07:59.8939088Z The connection to the server localhost:8080 was 
refused - did you specify the right host or port?
2020-03-27T00:07:59.9515679Z The connection to the server localhost:8080 was 
refused - did you specify the

[jira] [Created] (FLINK-16770) Resuming Externalized Checkpoint (rocks, incremental, scale up) end-to-end test fails with no such file

2020-03-25 Thread Zhijiang (Jira)
Zhijiang created FLINK-16770:


 Summary: Resuming Externalized Checkpoint (rocks, incremental, 
scale up) end-to-end test fails with no such file
 Key: FLINK-16770
 URL: https://issues.apache.org/jira/browse/FLINK-16770
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


The log : 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6603=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5]

 

There was a similar problem in 
https://issues.apache.org/jira/browse/FLINK-16561, but for the case of no 
parallelism change, whereas this case is for scaling up. Not quite sure whether 
the root cause is the same.
{code:java}
2020-03-25T06:50:31.3894841Z Running 'Resuming Externalized Checkpoint (rocks, 
incremental, scale up) end-to-end test'
2020-03-25T06:50:31.3895308Z 
==
2020-03-25T06:50:31.3907274Z TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-31390197304
2020-03-25T06:50:31.5500274Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-25T06:50:31.6354639Z Starting cluster.
2020-03-25T06:50:31.8871932Z Starting standalonesession daemon on host fv-az655.
2020-03-25T06:50:33.5021784Z Starting taskexecutor daemon on host fv-az655.
2020-03-25T06:50:33.5152274Z Waiting for Dispatcher REST endpoint to come up...
2020-03-25T06:50:34.5498116Z Waiting for Dispatcher REST endpoint to come up...
2020-03-25T06:50:35.6031346Z Waiting for Dispatcher REST endpoint to come up...
2020-03-25T06:50:36.9848425Z Waiting for Dispatcher REST endpoint to come up...
2020-03-25T06:50:38.0283377Z Dispatcher REST endpoint is up.
2020-03-25T06:50:38.0285490Z Running externalized checkpoints test, with 
ORIGINAL_DOP=2 NEW_DOP=4 and STATE_BACKEND_TYPE=rocks 
STATE_BACKEND_FILE_ASYNC=true STATE_BACKEND_ROCKSDB_INCREMENTAL=true 
SIMULATE_FAILURE=false ...
2020-03-25T06:50:46.1754645Z Job (b8cb04e4b1e730585bc616aa352866d0) is running.
2020-03-25T06:50:46.1758132Z Waiting for job (b8cb04e4b1e730585bc616aa352866d0) 
to have at least 1 completed checkpoints ...
2020-03-25T06:50:46.3478276Z Waiting for job to process up to 200 records, 
current progress: 173 records ...
2020-03-25T06:50:49.6332988Z Cancelling job b8cb04e4b1e730585bc616aa352866d0.
2020-03-25T06:50:50.4875673Z Cancelled job b8cb04e4b1e730585bc616aa352866d0.
2020-03-25T06:50:50.5468230Z ls: cannot access 
'/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-31390197304/externalized-chckpt-e2e-backend-dir/b8cb04e4b1e730585bc616aa352866d0/chk-[1-9]*/_metadata':
 No such file or directory
2020-03-25T06:50:50.5606260Z Restoring job with externalized checkpoint at . ...
2020-03-25T06:50:58.4728245Z 
2020-03-25T06:50:58.4732663Z 

2020-03-25T06:50:58.4735785Z  The program finished with the following exception:
2020-03-25T06:50:58.4737759Z 
2020-03-25T06:50:58.4742666Z 
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
2020-03-25T06:50:58.4746274Zat 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
2020-03-25T06:50:58.4749954Zat 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
2020-03-25T06:50:58.4752753Zat 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:142)
2020-03-25T06:50:58.4755400Zat 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:659)
2020-03-25T06:50:58.4757862Zat 
org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
2020-03-25T06:50:58.4760282Zat 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:890)
2020-03-25T06:50:58.4763591Zat 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:963)
2020-03-25T06:50:58.4764274Zat 
java.security.AccessController.doPrivileged(Native Method)
2020-03-25T06:50:58.4764809Zat 
javax.security.auth.Subject.doAs(Subject.java:422)
2020-03-25T06:50:58.4765434Zat 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
2020-03-25T06:50:58.4766180Zat 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
2020-03-25T06:50:58.4773549Zat 
org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:963)
2020-03-25T06:50:58.4774502Z Caused by: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException

[jira] [Created] (FLINK-16768) HadoopS3RecoverableWriterITCase.testRecoverWithStateWithMultiPart runs without exit

2020-03-25 Thread Zhijiang (Jira)
Zhijiang created FLINK-16768:


 Summary: 
HadoopS3RecoverableWriterITCase.testRecoverWithStateWithMultiPart runs without 
exit
 Key: FLINK-16768
 URL: https://issues.apache.org/jira/browse/FLINK-16768
 Project: Flink
  Issue Type: Task
  Components: FileSystems, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


Logs: 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6584=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=d26b3528-38b0-53d2-05f7-37557c2405e4]
{code:java}
2020-03-24T15:52:18.9196862Z "main" #1 prio=5 os_prio=0 tid=0x7fd36c00b800 nid=0xc21 runnable [0x7fd3743ce000]
2020-03-24T15:52:18.9197235Z    java.lang.Thread.State: RUNNABLE
2020-03-24T15:52:18.9197536Z    at java.net.SocketInputStream.socketRead0(Native Method)
2020-03-24T15:52:18.9197931Z    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
2020-03-24T15:52:18.9198340Z    at java.net.SocketInputStream.read(SocketInputStream.java:171)
2020-03-24T15:52:18.9198749Z    at java.net.SocketInputStream.read(SocketInputStream.java:141)
2020-03-24T15:52:18.9199171Z    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
2020-03-24T15:52:18.9199840Z    at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
2020-03-24T15:52:18.9200265Z    at sun.security.ssl.InputRecord.read(InputRecord.java:532)
2020-03-24T15:52:18.9200663Z    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
2020-03-24T15:52:18.9201213Z    - locked <0x927583d8> (a java.lang.Object)
2020-03-24T15:52:18.9201589Z    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
2020-03-24T15:52:18.9202026Z    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
2020-03-24T15:52:18.9202583Z    - locked <0x92758c00> (a sun.security.ssl.AppInputStream)
2020-03-24T15:52:18.9203029Z    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
2020-03-24T15:52:18.9203558Z    at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:198)
2020-03-24T15:52:18.9204121Z    at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176)
2020-03-24T15:52:18.9204626Z    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135)
2020-03-24T15:52:18.9205121Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9205679Z    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
2020-03-24T15:52:18.9206164Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9206786Z    at com.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125)
2020-03-24T15:52:18.9207361Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9207839Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9208327Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9208809Z    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
2020-03-24T15:52:18.9209273Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9210003Z    at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:107)
2020-03-24T15:52:18.9210658Z    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
2020-03-24T15:52:18.9211154Z    at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$read$3(S3AInputStream.java:445)
2020-03-24T15:52:18.9211631Z    at org.apache.hadoop.fs.s3a.S3AInputStream$$Lambda$42/1936375962.execute(Unknown Source)
2020-03-24T15:52:18.9212044Z    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
2020-03-24T15:52:18.9212553Z    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
2020-03-24T15:52:18.9212972Z    at org.apache.hadoop.fs.s3a.Invoker$$Lambda$23/1457226878.execute(Unknown Source)
2020-03-24T15:52:18.9213408Z    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
2020-03-24T15:52:18.9213866Z    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
2020-03-24T15:52:18.9214273Z    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231)
2020-03-24T15:52:18.9214701Z    at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:441)
2020-03-24T15:52:18.9215443Z    - locked <0x926e88b0> (a org.apache.hadoop.fs.s3a.S3AInputStream)
2020-03-24T15:52:18.9215852Z    at java.io.DataInputStream.read(DataInputStream.java:149)
2020-03-24T15:52:18.9216305Z    at org.apache.flink.runtime.fs.hdfs.HadoopDataInputStream.read(HadoopDataInputStre

[jira] [Created] (FLINK-16750) Kerberized YARN on Docker test fails with starting Hadoop cluster

2020-03-24 Thread Zhijiang (Jira)
Zhijiang created FLINK-16750:


 Summary: Kerberized YARN on Docker test fails with starting Hadoop cluster
 Key: FLINK-16750
 URL: https://issues.apache.org/jira/browse/FLINK-16750
 Project: Flink
  Issue Type: Task
  Components: Deployment / Docker, Deployment / YARN, Tests
Reporter: Zhijiang


Build: 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6563&view=results]

logs
{code:java}
2020-03-24T08:48:53.3813297Z 
==
2020-03-24T08:48:53.3814016Z Running 'Running Kerberized YARN on Docker test 
(custom fs plugin)'
2020-03-24T08:48:53.3814511Z 
==
2020-03-24T08:48:53.3827028Z TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-53382133956
2020-03-24T08:48:56.1944456Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-24T08:48:56.2300265Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-24T08:48:56.2412349Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-24T08:48:56.2861072Z Docker version 19.03.8, build afacb8b7f0
2020-03-24T08:48:56.8025297Z docker-compose version 1.25.4, build 8d51620a
2020-03-24T08:48:56.8499071Z Flink Tarball directory 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-53382133956
2020-03-24T08:48:56.8501170Z Flink tarball filename flink.tar.gz
2020-03-24T08:48:56.8502612Z Flink distribution directory name 
flink-1.11-SNAPSHOT
2020-03-24T08:48:56.8504724Z End-to-end directory 
/home/vsts/work/1/s/flink-end-to-end-tests
2020-03-24T08:48:56.8620115Z Building Hadoop Docker container
2020-03-24T08:48:56.9117609Z Sending build context to Docker daemon  56.83kB
2020-03-24T08:48:56.9117926Z 
2020-03-24T08:48:57.0076373Z Step 1/54 : FROM sequenceiq/pam:ubuntu-14.04
2020-03-24T08:48:57.0082811Z  ---> df7bea4c5f64
2020-03-24T08:48:57.0084798Z Step 2/54 : RUN set -x && addgroup hadoop 
&& useradd -d /home/hdfs -ms /bin/bash -G hadoop -p hdfs hdfs && useradd -d 
/home/yarn -ms /bin/bash -G hadoop -p yarn yarn && useradd -d /home/mapred 
-ms /bin/bash -G hadoop -p mapred mapred && useradd -d /home/hadoop-user 
-ms /bin/bash -p hadoop-user hadoop-user
2020-03-24T08:48:57.0092833Z  ---> Using cache
2020-03-24T08:48:57.0093976Z  ---> 3c12a7d3e20c
2020-03-24T08:48:57.0096889Z Step 3/54 : RUN set -x && apt-get update && 
apt-get install -y curl tar sudo openssh-server openssh-client rsync unzip 
krb5-user
2020-03-24T08:48:57.0106188Z  ---> Using cache
2020-03-24T08:48:57.0107830Z  ---> 9a59599596be
2020-03-24T08:48:57.0110793Z Step 4/54 : RUN set -x && mkdir -p 
/var/log/kerberos && touch /var/log/kerberos/kadmind.log
2020-03-24T08:48:57.0118896Z  ---> Using cache
2020-03-24T08:48:57.0121035Z  ---> c83551d4f695
2020-03-24T08:48:57.0125298Z Step 5/54 : RUN set -x && rm -f 
/etc/ssh/ssh_host_dsa_key /etc/ssh/ssh_host_rsa_key /root/.ssh/id_rsa && 
ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key && ssh-keygen -q -N 
"" -t rsa -f /etc/ssh/ssh_host_rsa_key && ssh-keygen -q -N "" -t rsa -f 
/root/.ssh/id_rsa && cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
2020-03-24T08:48:57.0133473Z  ---> Using cache
2020-03-24T08:48:57.0134240Z  ---> f69560c2bc0a
2020-03-24T08:48:57.0135683Z Step 6/54 : RUN set -x && mkdir -p 
/usr/java/default && curl -Ls 
'http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz'
 -H 'Cookie: oraclelicense=accept-securebackup-cookie' | tar 
--strip-components=1 -xz -C /usr/java/default/
2020-03-24T08:48:57.0148145Z  ---> Using cache
2020-03-24T08:48:57.0149008Z  ---> f824256d72f1
2020-03-24T08:48:57.0152616Z Step 7/54 : ENV JAVA_HOME /usr/java/default
2020-03-24T08:48:57.0155992Z  ---> Using cache
2020-03-24T08:48:57.0160104Z  ---> 770e6bfd219a
2020-03-24T08:48:57.0160410Z Step 8/54 : ENV PATH $PATH:$JAVA_HOME/bin
2020-03-24T08:48:57.0168690Z  ---> Using cache
2020-03-24T08:48:57.0169451Z  ---> 2643e1a25898
2020-03-24T08:48:57.0174785Z Step 9/54 : RUN set -x && curl -LOH 'Cookie: 
oraclelicense=accept-securebackup-cookie' 
'http://download.oracle.com/otn-pub/java/jce/8/jce_policy-8.zip' && unzip 
jce_policy-8.zip && cp /UnlimitedJCEPolicyJDK8/local_policy.jar 
/UnlimitedJCEPolicyJDK8/US_export_policy.jar $JAVA_HOME/jre/lib/security
2020-03-24T08:48:57.0187797Z  ---> Using cache
2020-03-

[jira] [Created] (FLINK-16739) PrestoS3FileSystemITCase#testSimpleFileWriteAndRead fails with no such key

2020-03-23 Thread Zhijiang (Jira)
Zhijiang created FLINK-16739:


 Summary: PrestoS3FileSystemITCase#testSimpleFileWriteAndRead fails 
with no such key
 Key: FLINK-16739
 URL: https://issues.apache.org/jira/browse/FLINK-16739
 Project: Flink
  Issue Type: Task
  Components: Connectors / FileSystem, Tests
Reporter: Zhijiang


Build: 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6546&view=logs&j=e9af9cde-9a65-5281-a58e-2c8511d36983&t=df5b2bf5-bcff-5dc9-7626-50bed0866a82]

logs
{code:java}
2020-03-24T01:51:19.6988685Z [INFO] Running 
org.apache.flink.fs.s3presto.PrestoS3FileSystemBehaviorITCase
2020-03-24T01:51:21.6250893Z [INFO] Running 
org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase
2020-03-24T01:51:25.1626385Z [WARNING] Tests run: 8, Failures: 0, Errors: 0, 
Skipped: 2, Time elapsed: 5.461 s - in 
org.apache.flink.fs.s3presto.PrestoS3FileSystemBehaviorITCase
2020-03-24T01:51:50.5503712Z [ERROR] Tests run: 7, Failures: 1, Errors: 1, 
Skipped: 0, Time elapsed: 28.922 s <<< FAILURE! - in 
org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase
2020-03-24T01:51:50.5506010Z [ERROR] testSimpleFileWriteAndRead[Scheme = 
s3p](org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase)  Time elapsed: 0.7 
s  <<< ERROR!
2020-03-24T01:51:50.5513057Z 
com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException:
 com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not 
exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request 
ID: A07D70A474EABC13; S3 Extended Request ID: 
R2ReW39oZ9ncoc82xb+V5h/EJV5/Mnsee+7uZ7cFMkliTQ/nKhvHPCDfr5zddbfUdR/S49VdbrA=), 
S3 Extended Request ID: 
R2ReW39oZ9ncoc82xb+V5h/EJV5/Mnsee+7uZ7cFMkliTQ/nKhvHPCDfr5zddbfUdR/S49VdbrA= 
(Path: s3://***/temp/tests-c79a578b-13d9-41ba-b73b-4f53fc965b96/test.txt)
2020-03-24T01:51:50.5517642Z Caused by: 
com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not 
exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request 
ID: A07D70A474EABC13; S3 Extended Request ID: 
R2ReW39oZ9ncoc82xb+V5h/EJV5/Mnsee+7uZ7cFMkliTQ/nKhvHPCDfr5zddbfUdR/S49VdbrA=)
2020-03-24T01:51:50.5519791Z 
2020-03-24T01:51:50.5520679Z [ERROR] 
org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase  Time elapsed: 17.431 s  
<<< FAILURE!
2020-03-24T01:51:50.5521841Z java.lang.AssertionError: expected: but 
was:
2020-03-24T01:51:50.5522437Z 
2020-03-24T01:51:50.8966641Z [INFO] 
2020-03-24T01:51:50.8967386Z [INFO] Results:
2020-03-24T01:51:50.8967849Z [INFO] 
2020-03-24T01:51:50.8968357Z [ERROR] Failures: 
2020-03-24T01:51:50.8970933Z [ERROR]   
PrestoS3FileSystemITCase>AbstractHadoopFileSystemITTest.teardown:155->AbstractHadoopFileSystemITTest.checkPathExistence:61
 expected: but was:
2020-03-24T01:51:50.8972311Z [ERROR] Errors: 
2020-03-24T01:51:50.8973807Z [ERROR]   
PrestoS3FileSystemITCase>AbstractHadoopFileSystemITTest.testSimpleFileWriteAndRead:87
 » UnrecoverableS3Operation
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16712) Refactor StreamTask to construct final fields

2020-03-22 Thread Zhijiang (Jira)
Zhijiang created FLINK-16712:


 Summary: Refactor StreamTask to construct final fields
 Key: FLINK-16712
 URL: https://issues.apache.org/jira/browse/FLINK-16712
 Project: Flink
  Issue Type: Task
  Components: Runtime / Task
Reporter: Zhijiang
Assignee: Zhijiang
 Fix For: 1.11.0


At the moment four fields are initialized in StreamTask#beforeInvoke: `stateBackend`, `checkpointStorage`, `timerService`, and `asyncOperationsThreadPool`.

In general, final fields are preferred because of their well-known benefits (immutability and safe publication), so we can refactor StreamTask to initialize these fields in the constructor instead.
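The refactoring can be sketched as follows. This is an illustrative stand-in only, not the actual Flink classes; the field types are hypothetical placeholders mirroring the names in the ticket:

```java
// Illustrative placeholders -- not the real Flink types.
class StateBackend {}
class TimerService {}

// Before: fields are assigned lazily in beforeInvoke(), so they are
// mutable and may be observed as null before invocation starts.
class LazyInitTask {
    StateBackend stateBackend;
    TimerService timerService;

    void beforeInvoke() {
        stateBackend = new StateBackend();
        timerService = new TimerService();
    }
}

// After: the same fields are final and guaranteed to be fully
// initialized once the constructor returns.
class FinalFieldTask {
    final StateBackend stateBackend;
    final TimerService timerService;

    FinalFieldTask() {
        this.stateBackend = new StateBackend();
        this.timerService = new TimerService();
    }
}
```

With the second form, readers never have to reason about a partially constructed task, and the compiler rejects any later reassignment.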





[jira] [Created] (FLINK-16690) Refactor StreamTaskTest to reuse TestTaskBuilder and MockStreamTaskBuilder

2020-03-20 Thread Zhijiang (Jira)
Zhijiang created FLINK-16690:


 Summary: Refactor StreamTaskTest to reuse TestTaskBuilder and 
MockStreamTaskBuilder
 Key: FLINK-16690
 URL: https://issues.apache.org/jira/browse/FLINK-16690
 Project: Flink
  Issue Type: Task
  Components: Runtime / Task, Tests
Reporter: Zhijiang
 Fix For: 1.11.0


We can reuse the existing TestTaskBuilder and MockStreamTaskBuilder to construct Task and StreamTask instances easily in tests, which simplifies the StreamTaskTest case.
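The builder-style setup being reused can be sketched like this. The class name follows the ticket, but the fields, defaults, and setters are hypothetical, chosen only to illustrate the pattern:

```java
// Illustrative sketch -- not the actual Flink test utilities.
class MockStreamTask {
    final String name;
    final int numInputs;

    MockStreamTask(String name, int numInputs) {
        this.name = name;
        this.numInputs = numInputs;
    }
}

// The builder centralizes the (mostly default) wiring, so individual
// tests only state the parameters they actually care about.
class MockStreamTaskBuilder {
    private String name = "test-task";
    private int numInputs = 1;

    MockStreamTaskBuilder setName(String name) {
        this.name = name;
        return this;
    }

    MockStreamTaskBuilder setNumInputs(int numInputs) {
        this.numInputs = numInputs;
        return this;
    }

    MockStreamTask build() {
        return new MockStreamTask(name, numInputs);
    }
}
```

A test would then write `new MockStreamTaskBuilder().setNumInputs(2).build()` instead of repeating the full construction boilerplate in every test case.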





[jira] [Created] (FLINK-16653) Introduce ResultPartitionWriterTestBase for simplifying tests

2020-03-18 Thread Zhijiang (Jira)
Zhijiang created FLINK-16653:


 Summary: Introduce ResultPartitionWriterTestBase for simplifying 
tests 
 Key: FLINK-16653
 URL: https://issues.apache.org/jira/browse/FLINK-16653
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network, Tests
Reporter: Zhijiang
Assignee: Zhijiang


At the moment there are at least four implementations of the `ResultPartitionWriter` interface used in unit tests. The interface has about ten methods, and most of them are implemented as dummies in tests.

Whenever we extend the `ResultPartitionWriter` interface, all four dummy implementations in tests have to be adjusted as well, which wastes effort.

Therefore, an abstract ResultPartitionWriterTestBase is proposed to implement the basic dummy methods of `ResultPartitionWriter`. The previous four instances can then extend it and override only the one or two methods a specific test requires, so extending the `ResultPartitionWriter` interface will usually mean adjusting only the ResultPartitionWriterTestBase.
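The proposed pattern can be sketched like this. The interface and method names below are simplified stand-ins, not the real `ResultPartitionWriter` API (which has roughly ten methods):

```java
// Simplified stand-in for the real interface.
interface ResultPartitionWriterLike {
    void emitRecord(String record);
    void flushAll();
    void close();
}

// The test base implements every method as a no-op exactly once...
abstract class ResultPartitionWriterTestBase implements ResultPartitionWriterLike {
    @Override public void emitRecord(String record) {}
    @Override public void flushAll() {}
    @Override public void close() {}
}

// ...so each concrete test helper overrides only what it observes.
class CountingWriter extends ResultPartitionWriterTestBase {
    int emitted;

    @Override
    public void emitRecord(String record) {
        emitted++;
    }
}
```

Adding a method to the interface then requires a single new no-op in the base class, rather than a change in every test implementation.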




