Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Matthias Pohl
Congrats, Guowei!

On Wed, Jan 20, 2021 at 8:22 AM Congxian Qiu  wrote:

> Congrats Guowei!
>
> Best,
> Congxian
>
>
> Danny Chan  于2021年1月20日周三 下午2:59写道:
>
> > Congratulations Guowei!
> >
> > Best,
> > Danny
> >
> > Jark Wu  于2021年1月20日周三 下午2:47写道:
> >
> > > Congratulations Guowei!
> > >
> > > Cheers,
> > > Jark
> > >
> > > On Wed, 20 Jan 2021 at 14:36, SHI Xiaogang 
> > wrote:
> > >
> > > > Congratulations MA!
> > > >
> > > > Regards,
> > > > Xiaogang
> > > >
> > > > Yun Tang  于2021年1月20日周三 下午2:24写道:
> > > >
> > > > > Congratulations Guowei!
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Yang Wang 
> > > > > Sent: Wednesday, January 20, 2021 13:59
> > > > > To: dev 
> > > > > Subject: Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink
> > > > > Committer
> > > > >
> > > > > Congratulations Guowei!
> > > > >
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Yun Gao  于2021年1月20日周三 下午1:52写道:
> > > > >
> > > > > > Congratulations Guowei!
> > > > > >
> > > > > > Best,
> > > > > >
> > > Yun
> > > > > > Sender:Yangze Guo
> > > > > > Date:2021/01/20 13:48:52
> > > > > > Recipient:dev
> > > > > > Theme:Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink
> > > Committer
> > > > > >
> > > > > > Congratulations, Guowei! Well deserved.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Wed, Jan 20, 2021 at 1:46 PM Xintong Song <
> > tonysong...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > Congratulations, Guowei~!
> > > > > > >
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jan 20, 2021 at 1:42 PM Yuan Mei <
> yuanmei.w...@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Congrats Guowei :-)
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yuan
> > > > > > > >
> > > > > > > > On Wed, Jan 20, 2021 at 1:36 PM tison 
> > > > wrote:
> > > > > > > >
> > > > > > > > > Congrats Guowei!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > tison.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Kurt Young  于2021年1月20日周三 下午1:34写道:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'm very happy to announce that Guowei Ma has accepted
> > > > > > > > > > the invitation to become a Flink committer.
> > > > > > > > > >
> > > > > > > > > > Guowei is a long-term Flink developer. He has been
> > > > > > > > > > extremely helpful with some important runtime changes,
> > > > > > > > > > and has also been active in answering user questions
> > > > > > > > > > as well as discussing designs.
> > > > > > > > > >
> > > > > > > > > > Please join me in congratulating Guowei for becoming a
> > > > > > > > > > Flink committer!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Kurt
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >


Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Congxian Qiu
Congrats Guowei!

Best,
Congxian




Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Arvid Heise
Congratulations!



-- 

Arvid Heise | Senior Java Developer



Follow us @VervericaData

--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Toni) Cheng


Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Danny Chan
Congratulations Guowei!

Best,
Danny



[jira] [Created] (FLINK-21048) Refactor documentation related to switch memory allocator

2021-01-19 Thread Yun Tang (Jira)
Yun Tang created FLINK-21048:


 Summary: Refactor documentation related to switch memory allocator 
 Key: FLINK-21048
 URL: https://issues.apache.org/jira/browse/FLINK-21048
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.13.0, 1.12.2
Reporter: Yun Tang
Assignee: Yun Tang


Since we decided in FLINK-21034 to change the switch for the memory allocator 
from a command-line option to an environment variable, we should also update 
the related documentation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Jark Wu
Congratulations Guowei!

Cheers,
Jark



[jira] [Created] (FLINK-21047) Expose the correct registered/free resources information in SlotManager

2021-01-19 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-21047:
--

 Summary: Expose the correct registered/free resources information 
in SlotManager
 Key: FLINK-21047
 URL: https://issues.apache.org/jira/browse/FLINK-21047
 Project: Flink
  Issue Type: Improvement
Reporter: Yangze Guo


In FLINK-16640, we extended ResourceOverview and TaskManager(Details)Info with 
registered/free resources. However, the implementation is based on the 
assumption that all registered task executors are homogeneous, with the 
default resource spec. This assumption breaks in standalone mode when users 
manually start heterogeneous task executors.

We need to calculate the registered/free resources according to the exact 
{{defaultSlotResourceProfile}} and {{totalResourceProfile}} of the 
{{TaskExecutorRegistration}}s.
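The fix amounts to summing the profiles each task executor actually reported at registration time, instead of multiplying one default spec by the executor count. A minimal sketch of that aggregation (a hypothetical simplified model for illustration, not the actual SlotManager code; field names are assumptions):

```python
from dataclasses import dataclass


@dataclass
class ResourceProfile:
    """Simplified stand-in for Flink's ResourceProfile: CPU cores and memory in MB."""
    cpu: float
    memory_mb: int

    def __add__(self, other: "ResourceProfile") -> "ResourceProfile":
        return ResourceProfile(self.cpu + other.cpu, self.memory_mb + other.memory_mb)


@dataclass
class TaskExecutorRegistration:
    """Each task executor reports its own total and default-slot profiles."""
    total: ResourceProfile
    default_slot: ResourceProfile
    free_slots: int


def registered_resources(registrations):
    # Sum the exact total profile of every registration, so heterogeneous
    # executors (e.g. manually started in standalone mode) are counted correctly.
    acc = ResourceProfile(0.0, 0)
    for reg in registrations:
        acc = acc + reg.total
    return acc


def free_resources(registrations):
    # Free resources are derived from each executor's *own* default slot
    # profile, not from a single cluster-wide default spec.
    acc = ResourceProfile(0.0, 0)
    for reg in registrations:
        for _ in range(reg.free_slots):
            acc = acc + reg.default_slot
    return acc


# Heterogeneous standalone setup: two differently sized task executors.
regs = [
    TaskExecutorRegistration(ResourceProfile(4.0, 4096), ResourceProfile(1.0, 1024), 2),
    TaskExecutorRegistration(ResourceProfile(8.0, 16384), ResourceProfile(2.0, 4096), 1),
]
print(registered_resources(regs))  # ResourceProfile(cpu=12.0, memory_mb=20480)
print(free_resources(regs))        # ResourceProfile(cpu=4.0, memory_mb=6144)
```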





Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread SHI Xiaogang
Congratulations MA!

Regards,
Xiaogang



Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yun Tang
Congratulations Guowei!

Best
Yun Tang



Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Leonard Xu
Congratulations Guowei! 

Best,
Leonard


[jira] [Created] (FLINK-21046) test_map_view and test_map_view_iterate test failed

2021-01-19 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-21046:


 Summary: test_map_view and test_map_view_iterate test failed
 Key: FLINK-21046
 URL: https://issues.apache.org/jira/browse/FLINK-21046
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.0
Reporter: Huang Xingbo


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12257=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3]
{code:java}
2021-01-20T02:25:30.7107189Z E   py4j.protocol.Py4JJavaError: 
An error occurred while calling 
z:org.apache.flink.table.runtime.arrow.ArrowUtils.collectAsPandasDataFrame.
2021-01-20T02:25:30.7108168Z E   : java.lang.RuntimeException: 
Could not remove element '+I[Hi,Hi2,Hi3,Hi_, 1,1,1,2, Hi:2,Hi2:1,Hi3:1,Hi_:1, 
3, hi]', should never happen.
2021-01-20T02:25:30.7108821Z E  at 
org.apache.flink.table.runtime.arrow.ArrowUtils.filterOutRetractRows(ArrowUtils.java:754)
2021-01-20T02:25:30.7109502Z E  at 
org.apache.flink.table.runtime.arrow.ArrowUtils.collectAsPandasDataFrame(ArrowUtils.java:673)
2021-01-20T02:25:30.7110005Z E  at 
sun.reflect.GeneratedMethodAccessor264.invoke(Unknown Source)
2021-01-20T02:25:30.7110481Z E  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2021-01-20T02:25:30.7110983Z E  at 
java.lang.reflect.Method.invoke(Method.java:498)
2021-01-20T02:25:30.7111503Z E  at 
org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
2021-01-20T02:25:30.7112281Z E  at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
2021-01-20T02:25:30.7113163Z E  at 
org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
2021-01-20T02:25:30.7114101Z E  at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
2021-01-20T02:25:30.7115014Z E  at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
2021-01-20T02:25:30.7115884Z E  at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
2021-01-20T02:25:30.7116379Z E  at 
java.lang.Thread.run(Thread.java:748)
{code}
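The failure comes from the retraction-filtering step when collecting results as a Pandas DataFrame: each retract row must cancel a previously accumulated row, and the error above fires when no matching row is found. A conceptual sketch of that invariant (illustration only, not Flink's actual `ArrowUtils.filterOutRetractRows` implementation):

```python
def filter_out_retract_rows(rows):
    """Accumulate insert/update-after rows and cancel them with matching
    retract rows. `rows` is a list of (change_kind, row) pairs, where
    change_kind is one of "+I", "+U", "-U", "-D"."""
    result = []
    for kind, row in rows:
        if kind in ("+I", "+U"):
            result.append(row)
        else:  # "-U" or "-D" must cancel an earlier accumulated row
            try:
                # Remove the last matching occurrence of the row.
                idx = len(result) - 1 - result[::-1].index(row)
                del result[idx]
            except ValueError:
                # Mirrors the "Could not remove element ..., should never
                # happen" error in the stack trace above: a retraction
                # arrived for a row that was never accumulated.
                raise RuntimeError(
                    f"Could not remove element {row!r}, should never happen.")
    return result


changelog = [("+I", "a"), ("+I", "b"), ("-D", "a"), ("+U", "c")]
print(filter_out_retract_rows(changelog))  # ['b', 'c']
```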





Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Yangze Guo
Thanks for the responses, Till and Xintong.

I second Xintong's comment that SSG-based runtime interface will give
us the flexibility to achieve op/task-based approach. That's one of
the most important reasons for our design choice.

Some thoughts regarding the default operator resource:
- It might be good for the scenario of DataStream jobs.
   ** For lightweight operators, the accumulated configuration error
will not be significant; the resource a task uses is then roughly
proportional to the number of operators it contains.
   ** For heavy operators like join and window, or operators using
external resources, users will turn to the fine-grained resource
configuration.
- It can increase stability for standalone clusters, where the
registered task executors are heterogeneous (with different default
slot resources).
- It might not be good for SQL users. The operators that SQL is
translated into are a black box to the user, and we also do not
guarantee cross-version consistency of the translation so far.

I think this can be treated as follow-up work once fine-grained
resource management is ready end-to-end.
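To make the two fallback options concrete, here is a toy model of SSG-based requirements where unspecified groups fall back either to a default slot resource (the FLIP's current design) or to a default operator resource times the operator count (the alternative discussed above). This is a hypothetical illustration, not Flink's API; all names and units are assumptions:

```python
DEFAULT_SLOT_RESOURCE = 1.0      # assumed unit: CPU cores per slot
DEFAULT_OPERATOR_RESOURCE = 0.5  # assumed unit: CPU cores per operator


def group_requirements(groups, explicit, fallback="slot"):
    """groups: mapping from slot sharing group name to its list of operators.
    explicit: resources the user specified for some groups.
    fallback: how to size groups with no explicit requirement —
    "slot" uses one default slot resource per group, "operator" uses
    the default operator resource times the number of operators."""
    result = {}
    for name, ops in groups.items():
        if name in explicit:
            result[name] = explicit[name]
        elif fallback == "slot":
            result[name] = DEFAULT_SLOT_RESOURCE
        else:
            result[name] = DEFAULT_OPERATOR_RESOURCE * len(ops)
    return result


# User only knows op_1's requirement; ssg_2 is left unspecified.
groups = {"ssg_1": ["op_1"], "ssg_2": ["op_2", "op_3", "op_4"]}
explicit = {"ssg_1": 2.0}

print(group_requirements(groups, explicit, fallback="slot"))
# {'ssg_1': 2.0, 'ssg_2': 1.0}   -- one default slot for the unspecified group
print(group_requirements(groups, explicit, fallback="operator"))
# {'ssg_1': 2.0, 'ssg_2': 1.5}   -- 0.5 per operator * 3 operators
```

Note how setting each operator to its own SSG makes the two options coincide, which is the flexibility argument made above.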

Best,
Yangze Guo


On Wed, Jan 20, 2021 at 11:16 AM Xintong Song  wrote:
>
> Thanks for the feedback, Till.
>
> ## I feel that what you proposed (operator-based + default value) might be
> subsumed by the SSG-based approach.
> Thinking of op_1 -> op_2, there are the following 4 cases, categorized by
> whether the resource requirements are known to the users.
>
>1. *Both known.* As previously mentioned, there's no reason to put
>multiple operators whose individual resource requirements are already known
>into the same group in fine-grained resource management. And if op_1 and
>op_2 are in different groups, there should be no problem switching data
>exchange mode from pipelined to blocking. This is equivalent to specifying
>operator resource requirements in your proposal.
>2. *op_1 known, op_2 unknown.* Similar to 1), except that op_2 is in a
>SSG whose resource is not specified thus would have the default slot
>resource. This is equivalent to having default operator resources in your
>proposal.
>3. *Both unknown*. The user can either set op_1 and op_2 to the same SSG
>or separate SSGs.
>   - If op_1 and op_2 are in the same SSG, it will be equivalent to the
>   coarse-grained resource management, where op_1 and op_2 share a default
>   size slot no matter which data exchange mode is used.
>   - If op_1 and op_2 are in different SSGs, then each of them will use
>   a default size slot. This is equivalent to setting them with default
>   operator resources in your proposal.
>4. *Total (pipeline) or max (blocking) of op_1 and op_2 is known.*
>   - It is possible that the user learns the total / max resource
>   requirement from executing and monitoring the job, while not being
>   aware of individual operator requirements.
>   - I believe this is the case your proposal does not cover. And TBH,
>   this is probably how most users learn the resource requirements,
>   according to my experience.
>   - In this case, the user might need to specify different resources if
>   he wants to switch the execution mode, which should not be worse than
>   not being able to use fine-grained resource management.
>
>
> ## An additional idea inspired by your proposal.
> We may provide multiple options for deciding resources for SSGs whose
> requirement is not specified, if needed.
>
>- Default slot resource (current design)
>- Default operator resource times number of operators (equivalent to
>your proposal)
>
>
> ## Exposing internal runtime strategies
> Theoretically, yes. Tying to the SSGs, the resource requirements might be
> affected if how SSGs are internally handled changes in future. Practically,
> I do not concretely see at the moment what kind of changes we may want in
> future that might conflict with this FLIP proposal, as the question of
> switching data exchange mode answered above. I'd suggest to not give up the
> user friendliness we may gain now for the future problems that may or may
> not exist.
>
> Moreover, the SSG-based approach has the flexibility to achieve the
> equivalent behavior as the operator-based approach, if we set each operator
> (or task) to a separate SSG. We can even provide a shortcut option to
> automatically do that for users, if needed.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Jan 19, 2021 at 11:48 PM Till Rohrmann  wrote:
>
> > Thanks for the responses Xintong and Stephan,
> >
> > I agree that being able to define the resource requirements for a group of
> > operators is more user friendly. However, my concern is that we are
> > exposing thereby internal runtime strategies which might limit our
> > flexibility to execute a given job. Moreover, the semantics of configuring
> > resource requirements for SSGs could break if switching 

[jira] [Created] (FLINK-21045) Support 'load module' and 'unload module' SQL syntax

2021-01-19 Thread Nicholas Jiang (Jira)
Nicholas Jiang created FLINK-21045:
--

 Summary: Support 'load module' and 'unload module' SQL syntax
 Key: FLINK-21045
 URL: https://issues.apache.org/jira/browse/FLINK-21045
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.13.0
Reporter: Nicholas Jiang


At present, Flink SQL doesn't support the 'load module' and 'unload module' SQL 
syntax. It is needed for cases where users want to load and unload user-defined 
modules through the Table API or the SQL client.
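For intuition, modules in Flink SQL contribute functions, and loaded modules are consulted in order when resolving a function name. A toy model of the load/unload semantics (names and behavior are simplified assumptions for illustration, not Flink's ModuleManager):

```python
class ModuleManager:
    """Minimal sketch: an ordered registry of modules, each providing functions."""

    def __init__(self):
        # A built-in "core" module is always loaded first.
        self.modules = {"core": {"UPPER": lambda s: s.upper()}}

    def load_module(self, name, functions):
        # What `LOAD MODULE name` would do: append the module to the registry.
        if name in self.modules:
            raise ValueError(f"Module {name!r} is already loaded")
        self.modules[name] = functions

    def unload_module(self, name):
        # What `UNLOAD MODULE name` would do: remove it from the registry.
        if name not in self.modules:
            raise ValueError(f"Module {name!r} is not loaded")
        del self.modules[name]

    def resolve_function(self, func_name):
        # Consult modules in loading order; the first match wins.
        for funcs in self.modules.values():
            if func_name in funcs:
                return funcs[func_name]
        raise KeyError(f"No function named {func_name!r} in loaded modules")


mm = ModuleManager()
mm.load_module("mymodule", {"REVERSE2": lambda s: s[::-1]})  # hypothetical UDF module
print(mm.resolve_function("REVERSE2")("abc"))  # cba
mm.unload_module("mymodule")
# mm.resolve_function("REVERSE2") would now raise KeyError
```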





Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yang Wang
Congratulations Guowei!


Best,
Yang



Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yun Gao
Congratulations Guowei!

Best,
 Yun


Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yangze Guo
Congratulations, Guowei! Well deserved.


Best,
Yangze Guo



Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yu Li
Congratulations and welcome, Guowei!

Best Regards,
Yu




Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Xintong Song
Congratulations, Guowei~!


Thank you~

Xintong Song



On Wed, Jan 20, 2021 at 1:42 PM Yuan Mei  wrote:

> Congrats Guowei :-)
>
> Best,
> Yuan
>
> On Wed, Jan 20, 2021 at 1:36 PM tison  wrote:
>
> > Congrats Guowei!
> >
> > Best,
> > tison.
> >
> >
> > Kurt Young  于2021年1月20日周三 下午1:34写道:
> >
> > > Hi everyone,
> > >
> > > I'm very happy to announce that Guowei Ma has accepted the invitation
> to
> > > become a Flink committer.
> > >
> > > Guowei is a very long term Flink developer, he has been extremely
> helpful
> > > with
> > > some important runtime changes, and also been  active with answering
> user
> > > questions as well as discussing designs.
> > >
> > > Please join me in congratulating Guowei for becoming a Flink committer!
> > >
> > > Best,
> > > Kurt
> > >
> >
>


Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Dian Fu
Congratulations Guowei! Well deserved!

Regards,
Dian

> 在 2021年1月20日,下午1:42,Yuan Mei  写道:
> 
> Congrats Guowei :-)
> 
> Best,
> Yuan
> 
> On Wed, Jan 20, 2021 at 1:36 PM tison  wrote:
> 
>> Congrats Guowei!
>> 
>> Best,
>> tison.
>> 
>> 
>> Kurt Young  于2021年1月20日周三 下午1:34写道:
>> 
>>> Hi everyone,
>>> 
>>> I'm very happy to announce that Guowei Ma has accepted the invitation to
>>> become a Flink committer.
>>> 
>>> Guowei is a very long term Flink developer, he has been extremely helpful
>>> with
>>> some important runtime changes, and also been  active with answering user
>>> questions as well as discussing designs.
>>> 
>>> Please join me in congratulating Guowei for becoming a Flink committer!
>>> 
>>> Best,
>>> Kurt
>>> 
>> 



Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Yuan Mei
Congrats Guowei :-)

Best,
Yuan

On Wed, Jan 20, 2021 at 1:36 PM tison  wrote:

> Congrats Guowei!
>
> Best,
> tison.
>
>
> Kurt Young  于2021年1月20日周三 下午1:34写道:
>
> > Hi everyone,
> >
> > I'm very happy to announce that Guowei Ma has accepted the invitation to
> > become a Flink committer.
> >
> > Guowei is a very long term Flink developer, he has been extremely helpful
> > with
> > some important runtime changes, and also been  active with answering user
> > questions as well as discussing designs.
> >
> > Please join me in congratulating Guowei for becoming a Flink committer!
> >
> > Best,
> > Kurt
> >
>


Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread tison
Congrats Guowei!

Best,
tison.


Kurt Young  于2021年1月20日周三 下午1:34写道:

> Hi everyone,
>
> I'm very happy to announce that Guowei Ma has accepted the invitation to
> become a Flink committer.
>
> Guowei is a very long term Flink developer, he has been extremely helpful
> with
> some important runtime changes, and also been  active with answering user
> questions as well as discussing designs.
>
> Please join me in congratulating Guowei for becoming a Flink committer!
>
> Best,
> Kurt
>


[ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-19 Thread Kurt Young
Hi everyone,

I'm very happy to announce that Guowei Ma has accepted the invitation to
become a Flink committer.

Guowei is a very long-term Flink developer. He has been extremely helpful
with some important runtime changes, and has also been active in answering
user questions as well as discussing designs.

Please join me in congratulating Guowei for becoming a Flink committer!

Best,
Kurt


[jira] [Created] (FLINK-21043) SemanticXidGeneratorTest.testXidsUniqueAmongGenerators test failed

2021-01-19 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-21043:


 Summary: SemanticXidGeneratorTest.testXidsUniqueAmongGenerators 
test failed
 Key: FLINK-21043
 URL: https://issues.apache.org/jira/browse/FLINK-21043
 Project: Flink
  Issue Type: Bug
  Components: Connectors / JDBC
Affects Versions: 1.13.0
Reporter: Huang Xingbo


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12245=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20]
{code:java}
2021-01-19T15:21:35.5665063Z [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.1 s <<< FAILURE! - in org.apache.flink.connector.jdbc.xa.SemanticXidGeneratorTest
2021-01-19T15:21:35.5665936Z [ERROR] testXidsUniqueAmongGenerators(org.apache.flink.connector.jdbc.xa.SemanticXidGeneratorTest)  Time elapsed: 0.024 s  <<< FAILURE!
2021-01-19T15:21:35.5666770Z junit.framework.AssertionFailedError: expected:<1> but was:<>
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21044) SemanticXidGeneratorTest.testXidsUniqueAmongGenerators test failed

2021-01-19 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-21044:


 Summary: SemanticXidGeneratorTest.testXidsUniqueAmongGenerators 
test failed
 Key: FLINK-21044
 URL: https://issues.apache.org/jira/browse/FLINK-21044
 Project: Flink
  Issue Type: Bug
  Components: Connectors / JDBC
Affects Versions: 1.13.0
Reporter: Huang Xingbo


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12245=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20]
{code:java}
2021-01-19T15:21:35.5665063Z [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.1 s <<< FAILURE! - in org.apache.flink.connector.jdbc.xa.SemanticXidGeneratorTest
2021-01-19T15:21:35.5665936Z [ERROR] testXidsUniqueAmongGenerators(org.apache.flink.connector.jdbc.xa.SemanticXidGeneratorTest)  Time elapsed: 0.024 s  <<< FAILURE!
2021-01-19T15:21:35.5666770Z junit.framework.AssertionFailedError: expected:<1> but was:<>
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21042) Correct the error in the code example in page 'aggregate-functions'.

2021-01-19 Thread Roc Marshal (Jira)
Roc Marshal created FLINK-21042:
---

 Summary: Correct the error in the code example in page 
'aggregate-functions'.
 Key: FLINK-21042
 URL: https://issues.apache.org/jira/browse/FLINK-21042
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.12.1, 1.13.0
Reporter: Roc Marshal


The markdown file location: flink/docs/dev/table/functions/udfs.zh.md & 
flink/docs/dev/table/functions/udfs.md 

The target url : 
[https://ci.apache.org/projects/flink/flink-docs-master/dev/table/functions/udfs.html#aggregate-functions]

 

The original content:

*SELECT myField, WeightedAvg(value, weight) FROM MyTable GROUP BY myField*

 

'value' is a reserved keyword of the Flink Table module, so the correct usage 
is to escape it with backticks:

_*SELECT myField, WeightedAvg(`value`, weight) FROM MyTable GROUP BY myField*_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21041) Introduce ExecNodeGraph to wrap the ExecNode topology

2021-01-19 Thread godfrey he (Jira)
godfrey he created FLINK-21041:
--

 Summary: Introduce ExecNodeGraph to wrap the ExecNode topology
 Key: FLINK-21041
 URL: https://issues.apache.org/jira/browse/FLINK-21041
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.13.0


Currently, we use a {{List}} to represent the {{ExecNode}} topology. As we will 
introduce more features (such as serializing/deserializing {{ExecNode}}s), it's 
better to introduce a unified class to represent the topology.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Xintong Song
Thanks for the feedback, Till.

## I feel that what you proposed (operator-based + default value) might be
subsumed by the SSG-based approach.
Thinking of op_1 -> op_2, there are the following 4 cases, categorized by
whether the resource requirements are known to the users.

   1. *Both known.* As previously mentioned, there's no reason to put
   multiple operators whose individual resource requirements are already known
   into the same group in fine-grained resource management. And if op_1 and
   op_2 are in different groups, there should be no problem switching data
   exchange mode from pipelined to blocking. This is equivalent to specifying
   operator resource requirements in your proposal.
   2. *op_1 known, op_2 unknown.* Similar to 1), except that op_2 is in a
   SSG whose resource is not specified thus would have the default slot
   resource. This is equivalent to having default operator resources in your
   proposal.
   3. *Both unknown*. The user can either set op_1 and op_2 to the same SSG
   or separate SSGs.
  - If op_1 and op_2 are in the same SSG, it will be equivalent to the
  coarse-grained resource management, where op_1 and op_2 share a default
  size slot no matter which data exchange mode is used.
  - If op_1 and op_2 are in different SSGs, then each of them will use
  a default size slot. This is equivalent to setting them with default
  operator resources in your proposal.
   4. *Total (pipeline) or max (blocking) of op_1 and op_2 is known.*
  - It is possible that the user learns the total / max resource
  requirement from executing and monitoring the job, while not
being aware of
  individual operator requirements.
  - I believe this is the case your proposal does not cover. And TBH,
  this is probably how most users learn the resource requirements,
according
  to my experiences.
  - In this case, the user might need to specify different resources if
  he wants to switch the execution mode, which should not be worse than not
  being able to use fine-grained resource management.


## An additional idea inspired by your proposal.
We may provide multiple options for deciding resources for SSGs whose
requirement is not specified, if needed.

   - Default slot resource (current design)
   - Default operator resource times number of operators (equivalent to
   your proposal)


## Exposing internal runtime strategies
Theoretically, yes. By tying resource requirements to SSGs, they might be
affected if how SSGs are internally handled changes in the future. Practically,
I do not concretely see at the moment what kind of future changes might
conflict with this FLIP proposal; the question of switching the data exchange
mode is answered above. I'd suggest not giving up the user friendliness we may
gain now for future problems that may or may not exist.

Moreover, the SSG-based approach has the flexibility to achieve the
equivalent behavior as the operator-based approach, if we set each operator
(or task) to a separate SSG. We can even provide a shortcut option to
automatically do that for users, if needed.


Thank you~

Xintong Song



On Tue, Jan 19, 2021 at 11:48 PM Till Rohrmann  wrote:

> Thanks for the responses Xintong and Stephan,
>
> I agree that being able to define the resource requirements for a group of
> operators is more user friendly. However, my concern is that we are
> exposing thereby internal runtime strategies which might limit our
> flexibility to execute a given job. Moreover, the semantics of configuring
> resource requirements for SSGs could break if switching from streaming to
> batch execution. If one defines the resource requirements for op_1 -> op_2
> which run in pipelined mode when using the streaming execution, then how do
> we interpret these requirements when op_1 -> op_2 are executed with a
> blocking data exchange in batch execution mode? Consequently, I am still
> leaning towards Stephan's proposal to set the resource requirements per
> operator.
>
> Maybe the following proposal makes the configuration easier: If the user
> wants to use fine-grained resource requirements, then she needs to specify
> the default size which is used for operators which have no explicit
> resource annotation. If this holds true, then every operator would have a
> resource requirement and the system can try to execute the operators in the
> best possible manner w/o being constrained by how the user set the SSG
> requirements.
>
> Cheers,
> Till
>
> On Tue, Jan 19, 2021 at 9:09 AM Xintong Song 
> wrote:
>
> > Thanks for the feedback, Stephan.
> >
> > Actually, your proposal has also come to my mind at some point. And I
> have
> > some concerns about it.
> >
> >
> > 1. It does not give users the same control as the SSG-based approach.
> >
> >
> > While both approaches do not require specifying for each operator,
> > SSG-based approach supports the semantic that "some operators together
> use
> > this 

[jira] [Created] (FLINK-21040) The exception "Unknown consumerTag" may be thrown when RMQSource close() is called.

2021-01-19 Thread pcalpha (Jira)
pcalpha created FLINK-21040:
---

 Summary: The exception "Unknown consumerTag" may be thrown when 
RMQSource close() is called.
 Key: FLINK-21040
 URL: https://issues.apache.org/jira/browse/FLINK-21040
 Project: Flink
  Issue Type: Bug
  Components: Connectors / RabbitMQ
Affects Versions: 1.12.1, 1.11.3, 1.11.2, 1.11.1, 1.12.0, 1.10.2, 1.11.0, 
1.10.1, 1.10.0, 1.9.3, 1.9.2, 1.9.1, 1.9.0, 1.8.3, 1.8.2, 1.8.1, 1.8.0, 1.7.2, 
1.6.4, 1.6.3
Reporter: pcalpha


The log is below

2020-10-27 21:00:57  [ Source: rabbitmqSource -> Flat Map (1/1):292311427 ] - [ ERROR ]  Error during disposal of stream operator.
java.lang.RuntimeException: Error while cancelling RMQ consumer on xxxQueueName at 10.xx.xx.102
    at org.apache.flink.streaming.connectors.rabbitmq.RMQSource.close(RMQSource.java:196)
    at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:703)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:635)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:542)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: Unknown consumerTag
    at com.rabbitmq.client.impl.ChannelN.basicCancel(ChannelN.java:1285)
    at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.basicCancel(AutorecoveringChannel.java:482)
    at org.apache.flink.streaming.connectors.rabbitmq.RMQSource.close(RMQSource.java:192)
    ... 8 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21039) Broken links in "dev/table/legacy_planner.zh.md"

2021-01-19 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-21039:


 Summary: Broken links in "dev/table/legacy_planner.zh.md"
 Key: FLINK-21039
 URL: https://issues.apache.org/jira/browse/FLINK-21039
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Table SQL / Ecosystem
Affects Versions: 1.13.0
Reporter: Huang Xingbo


The Chinese page dev/table/legacy_planner.zh.md contains some broken links that cause build errors.
{code:java}
Liquid Exception: Could not find document 'dev/batch/index.md' in tag 'link'. 
Make sure the document exists and the path is correct. in 
dev/table/legacy_planner.zh.md
Could not find document 'dev/batch/index.md' in tag 'link'.
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [Announce] SQL docs are now Blink only

2021-01-19 Thread Leonard Xu
Thanks Seth for the great job!
As the Blink planner has been the default planner and the legacy planner is on 
the way to removal, this makes our documentation clearer.


Best,
Leonard

> 在 2021年1月19日,23:48,Till Rohrmann  写道:
> 
> Awesome. Thanks a lot Seth! This will help us to keep the docs more easily
> up to date.
> 
> Cheers,
> Till
> 
> On Tue, Jan 19, 2021 at 4:30 PM Seth Wiesman  wrote:
> 
>> Hi Everyone,
>> 
>> I just merged in a PR to make the SQL / Table docs Blink planner only.
>> Going forward, you do not need to mark something as Blink only or explain
>> divergent semantics. Simply write the docs as if Blink were the only
>> planner.
>> 
>> There is a Legacy planner specific page[1]. If you add a feature that is
>> not supported by the legacy planner, simply add a note to the legacy page
>> stating as such.
>> 
>> Seth
>> 
>> [1]
>> 
>> https://github.com/apache/flink/blob/master/docs/dev/table/legacy_planner.md
>> 



[jira] [Created] (FLINK-21038) jobmanager.sh may misinterpret webui-port argument

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21038:


 Summary: jobmanager.sh may misinterpret webui-port argument
 Key: FLINK-21038
 URL: https://issues.apache.org/jira/browse/FLINK-21038
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts
Affects Versions: 1.9.0
Reporter: Chesnay Schepler


The usage description for {{jobmanager.sh}} is as follows:
{code}
Usage: jobmanager.sh ((start|start-foreground) [host] 
[webui-port])|stop|stop-all
{code}
It shows both {{host}} and {{webui-port}} as _separate_ optional arguments, 
however we hard-code the positions of these parameters: 
{code}
STARTSTOP=$1
HOST=$2 # optional when starting multiple instances
WEBUIPORT=$3 # optional when starting multiple instances
{code}
As such you cannot set the webui.port without also setting some dummy host 
argument.

I'm wondering whether we could not remove these in favor of dynamic properties.
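
A purely hypothetical sketch of a named-flag alternative (the flag names and defaults below are made up for illustration, not the real script's interface):

```shell
# Hypothetical sketch (not the real jobmanager.sh): parse host and webui-port
# as named flags so the port can be set without passing a dummy host argument.
parse_args() {
  STARTSTOP=$1; shift
  HOST=""; WEBUIPORT=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --host)       HOST="$2";      shift 2 ;;
      --webui-port) WEBUIPORT="$2"; shift 2 ;;
      *) echo "Unknown argument: $1" >&2; return 1 ;;
    esac
  done
  echo "action=$STARTSTOP host=${HOST:-default} webui-port=${WEBUIPORT:-8081}"
}

# The web UI port can now be set on its own:
parse_args start --webui-port 9091
# -> action=start host=default webui-port=9091
```

With fixed positions this would have required `start somehost 9091`; with named flags the host argument simply stays at its default.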



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21037) Deduplicate configuration logic in docker entrypoint

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21037:


 Summary: Deduplicate configuration logic in docker entrypoint
 Key: FLINK-21037
 URL: https://issues.apache.org/jira/browse/FLINK-21037
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts
Affects Versions: 1.12.2
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0


The various branches in {{docker-entrypoint.sh}} each set configuration 
properties and pipe {{FLINK_PROPERTIES}} into the configuration.
We don't need to make these calls separately for each branch.
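
A rough sketch of the deduplication idea (the config file path and the branch bodies are placeholders, not the actual entrypoint):

```shell
# Sketch: apply FLINK_PROPERTIES to the configuration once, before branching
# on the command, instead of repeating the call in every case branch.
CONF_FILE="./flink-conf.yaml"   # placeholder path for illustration
: > "$CONF_FILE"

apply_flink_properties() {
  if [ -n "${FLINK_PROPERTIES:-}" ]; then
    echo "${FLINK_PROPERTIES}" >> "$CONF_FILE"
  fi
}

run() {
  apply_flink_properties   # single shared call for all branches
  case "$1" in
    jobmanager)  echo "starting jobmanager" ;;
    taskmanager) echo "starting taskmanager" ;;
    *)           echo "running: $*" ;;
  esac
}

FLINK_PROPERTIES='taskmanager.numberOfTaskSlots: 4' run jobmanager
# -> starting jobmanager (and the property lands in $CONF_FILE)
```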



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21036) Consider removing automatic configuration of number of slots from docker

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21036:


 Summary: Consider removing automatic configuration of number of 
slots from docker
 Key: FLINK-21036
 URL: https://issues.apache.org/jira/browse/FLINK-21036
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts
Reporter: Chesnay Schepler
 Fix For: 1.13.0


The {{docker-entrypoint.sh}} supports setting the number of task slots via the 
{{TASK_MANAGER_NUMBER_OF_TASK_SLOTS}} environment variable, which defaults to 
the number of cpu cores via {{$(grep -c ^processor /proc/cpuinfo)}}.

The environment variable itself is redundant nowadays since we introduced 
{{FLINK_PROPERTIES}}, and is no longer documented.

Defaulting to the number of CPU cores can be considered convenience, but it 
seems odd to have this specific to docker while the distribution defaults to 
{{1}}.
The bigger issue in my mind though is that this creates a configuration 
mismatch between the Job- and TaskManager processes; the ResourceManager 
specifically needs to know how many slots a worker has to make decisions about 
redundancy and allocating resources.
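
For reference, the defaulting described above boils down to something like the following sketch (a fallback is added here for systems without /proc/cpuinfo; the real entrypoint may differ):

```shell
# Sketch of the current docker behavior: use the explicitly configured value
# if present, otherwise fall back to the number of CPU cores.
CORES=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || echo 1)
TASK_SLOTS="${TASK_MANAGER_NUMBER_OF_TASK_SLOTS:-$CORES}"
echo "taskmanager.numberOfTaskSlots: $TASK_SLOTS"
```

The distribution, by contrast, simply defaults `taskmanager.numberOfTaskSlots` to `1`.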



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21035) Deduplicate copy_plugins_if_required calls

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21035:


 Summary: Deduplicate copy_plugins_if_required calls
 Key: FLINK-21035
 URL: https://issues.apache.org/jira/browse/FLINK-21035
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0, 1.12.2


Deduplicate {{copy_plugins_if_required}} calls in the docker-entrypoint.sh to 
reduce complexity and prevent accidents where users are locked out of using 
plugins.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21034) Rework jemalloc switch to use an environment variable

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21034:


 Summary: Rework jemalloc switch to use an environment variable
 Key: FLINK-21034
 URL: https://issues.apache.org/jira/browse/FLINK-21034
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0, 1.12.2


The docker scripts have a flag for disabling jemalloc, which is currently 
passed as a parameter to the script.
Such things should be done with environment options instead, as they are less 
error-prone (due to us not having to modify the arguments list) and easier to 
use.
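
A minimal sketch of what an environment-based switch could look like (the variable name `DISABLE_JEMALLOC` and the behavior are assumptions for illustration, not a decided interface):

```shell
# Sketch: control the jemalloc switch via an environment variable instead of
# a positional script argument.
maybe_enable_jemalloc() {
  if [ "${DISABLE_JEMALLOC:-false}" = "true" ]; then
    echo "jemalloc disabled"
  else
    # In the real entrypoint this would export LD_PRELOAD pointing at libjemalloc.
    echo "jemalloc enabled"
  fi
}

maybe_enable_jemalloc                        # -> jemalloc enabled
DISABLE_JEMALLOC=true maybe_enable_jemalloc  # -> jemalloc disabled
```

This avoids having to splice a flag into (or out of) the argument list before forwarding it.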



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Steven Wu
Ingo,

regarding "state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}", it definitely
can work. but now users need to know that we can use "CHECKPOINTS_DIR" env
var to override "state.checkpoints.dir". That is the inconvenience that I
am trying to avoid. "state.checkpoints.dir" is well documented in the Flink
website. Now, we need to document "CHECKPOINTS_DIR" separately.
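
For illustration, the `${CHECKPOINTS_DIR:-path1}` style above could be expanded at startup along these lines (a sketch only; plain envsubst does not understand the `:-` default form, hence the eval, which should only be used on trusted templates):

```shell
# A config template line with a shell-style default, expanded at startup.
TEMPLATE='state.checkpoints.dir: ${CHECKPOINTS_DIR:-s3://bucket/default-path}'

expand() { eval "echo \"$1\""; }

expand "$TEMPLATE"
# -> state.checkpoints.dir: s3://bucket/default-path
CHECKPOINTS_DIR=s3://bucket/override expand "$TEMPLATE"
# -> state.checkpoints.dir: s3://bucket/override
```

This works, but it is exactly what creates the documentation burden above: the env var name is a second, separate name for the same option.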

Ideally, we want to just let the user define a env var like
"state.checkpoints.dir=some/overriden/path". But it suffers the problem
that dot chars are invalid characters for shell env var names. That is the
dilemma.

That is why we bundled all the non-conforming env var overrides (where
variable names contain non-conforming chars) into a single base64-encoded
string (like "NON_CONFORMING_OVERRIDES_BASE64=...") in our deployment
infrastructure, and unpack them at container startup. It is hacky. I am
hoping that there is a more elegant solution.
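
That workaround looks roughly like the following sketch (the variable name is from the mail; the payload format, one "key: value" per line, is an assumption):

```shell
# Keys containing dots are invalid shell variable names, so all overrides are
# packed into one conforming, base64-encoded variable by the deployment
# infrastructure, and unpacked at container startup.
OVERRIDES='state.checkpoints.dir: s3://bucket/overridden/path'
NON_CONFORMING_OVERRIDES_BASE64=$(printf '%s' "$OVERRIDES" | base64)

# ...later, inside the container entrypoint:
printf '%s' "$NON_CONFORMING_OVERRIDES_BASE64" | base64 -d >> flink-conf.yaml
grep 'state.checkpoints.dir' flink-conf.yaml   # shows the unpacked override
```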

If the configuration key is conforming to shell standard (like
S3_ACCESS_KEY), then we don't have a problem. but it will be deviating from
the Flink config naming convention (Java property style, dot separated).

Thanks,
Steven


On Tue, Jan 19, 2021 at 6:56 AM Till Rohrmann  wrote:

> I think a short FLIP would be awesome.
>
> I guess this feature hasn't been implemented yet because it has not been
> implemented yet ;-) I agree that this feature will improve configuration
> ergonomics big time :-)
>
> Cheers,
> Till
>
> On Tue, Jan 19, 2021 at 12:28 PM Ufuk Celebi  wrote:
>
> > Hey all,
> >
> > I think that approach 2 is more idiomatic for container deployments where
> > it can be cumbersome to manually map flink-conf.yaml contents to env vars
> > [1]. The precedence order outlined by Till would also cover Steven's
> > hierarchical overwrite requirement.
> >
> > I'm really excited about this feature as it will make Flink deployments a
> > lot more ergonomic. The implementation seems to be not too complicated
> > (which makes we wonder why we didn't tackle this earlier or whether I'm
> > missing something).
> >
> > I'd also be happy to shepherd this contribution if there is consensus on
> > the need for it and the approach. Does it make sense to formalize this
> > decision a bit with a short FLIP?
> >
> > – Ufuk
> >
> > [1] In Ververica Platform, we support approach 1, because the Flink
> > configuration is part of the specification for a single Deployment and
> it's
> > minimally more convenient to have something like
> >
> > flinkConfiguration:
> >   foo: ${BAR}
> >
> > for us. I don't think this approach would feel natural when manually
> > deploying Flink. There would be a clear migration path for our customers,
> > so I'm not concerned about this too much.
> >
> > On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote:
> > > Hi everyone,
> > >
> > > Thanks for starting this discussion Ingo. I think being able to use env
> > > variables to change Flink's configuration will be a very useful
> feature.
> > >
> > > Concerning the two approaches I would be in favour of the second
> approach
> > > ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
> > > prepare a special flink-conf.yaml where he inserts env variables for
> > every
> > > config value he wants to configure. Since this is not required with the
> > > second approach, I think it is more general and easier to use. Also,
> the
> > > user does not have to remember a second set of names (env names) which
> he
> > > has to/can set.
> > >
> > > For how to substitute the values, I think it should happen when we load
> > the
> > > Flink configuration. First we read the file and then overwrite values
> > > specified via an env variable or dynamic properties in some defined
> > order.
> > > For env.java.opts and other options which are used for starting the JVM
> > we
> > > might need special handling in the bash scripts.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk  wrote:
> > >
> > > > Hi Yang,
> > > >
> > > > 1. As you said I think this doesn't affect Ververica Platform,
> really,
> > so
> > > > I'm more than happy to hear and follow the thoughts of people more
> > > > experienced with Flink than me.
> > > > 2. I wasn't aware of env.java.opts, but that's definitely a candidate
> > where
> > > > a user may want to "escape" it so it doesn't get substituted
> > immediately, I
> > > > agree.
> > > >
> > > >
> > > > Regards
> > > > Ingo
> > > >
> > > > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang 
> > wrote:
> > > >
> > > > > Hi Ingo,
> > > > >
> > > > > Thanks for your response.
> > > > >
> > > > > 1. Not distinguishing JM/TM is reasonable, but what about the
> client
> > > > side.
> > > > > For Yarn/K8s deployment,
> > > > > the local flink-conf.yaml will be shipped to JM/TM. So I am just
> > confused
> > > > > about where should the environment
> > > > > variables be replaced? IIUC, it is not an issue for Ververica
> > Platform
> > > > > since it is always done in the JM/TM side.
> > > > 

Re: [Announce] SQL docs are now Blink only

2021-01-19 Thread Till Rohrmann
Awesome. Thanks a lot Seth! This will help us to keep the docs more easily
up to date.

Cheers,
Till

On Tue, Jan 19, 2021 at 4:30 PM Seth Wiesman  wrote:

> Hi Everyone,
>
> I just merged in a PR to make the SQL / Table docs Blink planner only.
> Going forward, you do not need to mark something as Blink only or explain
> divergent semantics. Simply write the docs as if Blink were the only
> planner.
>
> There is a Legacy planner specific page[1]. If you add a feature that is
> not supported by the legacy planner, simply add a note to the legacy page
> stating as such.
>
> Seth
>
> [1]
>
> https://github.com/apache/flink/blob/master/docs/dev/table/legacy_planner.md
>


Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Till Rohrmann
Thanks for the responses Xintong and Stephan,

I agree that being able to define the resource requirements for a group of
operators is more user friendly. However, my concern is that we are
exposing thereby internal runtime strategies which might limit our
flexibility to execute a given job. Moreover, the semantics of configuring
resource requirements for SSGs could break if switching from streaming to
batch execution. If one defines the resource requirements for op_1 -> op_2
which run in pipelined mode when using the streaming execution, then how do
we interpret these requirements when op_1 -> op_2 are executed with a
blocking data exchange in batch execution mode? Consequently, I am still
leaning towards Stephan's proposal to set the resource requirements per
operator.

Maybe the following proposal makes the configuration easier: If the user
wants to use fine-grained resource requirements, then she needs to specify
the default size which is used for operators which have no explicit
resource annotation. If this holds true, then every operator would have a
resource requirement and the system can try to execute the operators in the
best possible manner w/o being constrained by how the user set the SSG
requirements.

Cheers,
Till

On Tue, Jan 19, 2021 at 9:09 AM Xintong Song  wrote:

> Thanks for the feedback, Stephan.
>
> Actually, your proposal has also come to my mind at some point. And I have
> some concerns about it.
>
>
> 1. It does not give users the same control as the SSG-based approach.
>
>
> While both approaches do not require specifying for each operator,
> SSG-based approach supports the semantic that "some operators together use
> this much resource" while the operator-based approach doesn't.
>
>
> Think of a long pipeline with m operators (o_1, o_2, ..., o_m), and at some
> point there's an agg o_n (1 < n < m) which significantly reduces the data
> amount. One can separate the pipeline into 2 groups SSG_1 (o_1, ..., o_n)
> and SSG_2 (o_n+1, ... o_m), so that configuring much higher parallelisms
> for operators in SSG_1 than for operators in SSG_2 won't lead to too much
> wasting of resources. If the two SSGs end up needing different resources,
> with the SSG-based approach one can directly specify resources for the two
> groups. However, with the operator-based approach, the user will have to
> specify resources for each operator in one of the two groups, and tune the
> default slot resource via configurations to fit the other group.
>
>
> 2. It increases the chance of breaking operator chains.
>
>
> Setting chainnable operators into different slot sharing groups will
> prevent them from being chained. In the current implementation, downstream
> operators, if SSG not explicitly specified, will be set to the same group
> as the chainable upstream operators (unless multiple upstream operators in
> different groups), to reduce the chance of breaking chains.
>
>
> Thinking of chainable operators o_1 -> o_2 -> o_3 -> o_3, deciding SSGs
> based on whether resource is specified we will easily get groups like (o_1,
> o_3) & (o_2, o_4), where none of the operators can be chained. This is also
> possible for the SSG-based approach, but I believe the chance is much
> smaller because there's no strong reason for users to specify the groups
> with alternate operators like that. We are more likely to get groups like
> (o_1, o_2) & (o_3, o_4), where the chain breaks only between o_2 and o_3.
>
>
> 3. It complicates the system by having two different mechanisms for sharing
> managed memory in  a slot.
>
>
> - In FLIP-141, we introduced the intra-slot managed memory sharing
> mechanism, where managed memory is first distributed according to the
> consumer type, then further distributed across operators of that consumer
> type.
>
> - With the operator-based approach, managed memory size specified for an
> operator should account for all the consumer types of that operator. That
> means the managed memory is first distributed across operators, then
> distributed to different consumer types of each operator.
>
>
> Unfortunately, the different order of the two calculation steps can lead to
> different results. To be specific, the semantic of the configuration option
> `consumer-weights` changed (within a slot vs. within an operator).
>
>
>
> To sum up things:
>
> While (3) might be a bit more implementation related, I think (1) and (2)
> somehow suggest that, the price for the proposed approach to avoid
> specifying resource for every operator is that it's not as independent from
> operator chaining and slot sharing as the operator-based approach discussed
> in the FLIP.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Jan 19, 2021 at 4:29 AM Stephan Ewen  wrote:
>
> > Thanks a lot, Yangze and Xintong for this FLIP.
> >
> > I want to say, first of all, that this is super well written. And the
> > points that the FLIP makes about how to expose the configuration to users
> > are exactly the right thing to figure 

[jira] [Created] (FLINK-21033) Remove PendingCheckpoint.statsCallback

2021-01-19 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-21033:
-

 Summary: Remove PendingCheckpoint.statsCallback
 Key: FLINK-21033
 URL: https://issues.apache.org/jira/browse/FLINK-21033
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / Metrics
Affects Versions: 1.13.0
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan
 Fix For: 1.13.0


During the offline discussion of FLINK-19462 with [~pnowojski] we decided to 
remove PendingCheckpoint.statsCallback. Instead of having it as a callback 
field, it can either be passed as an argument or even inlined into the caller 
(CheckpointCoordinator).
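As a hedged sketch of the direction (the class and method names below are simplified stand-ins, not Flink's actual signatures), moving the callback from a field to an argument looks roughly like this:

```python
# Hypothetical before/after sketch of the refactoring: instead of every
# pending checkpoint holding a stats callback as a field, the caller
# (the coordinator) owns the tracker and passes it in where needed.

class StatsTracker:
    def __init__(self):
        self.completed = []

    def report_completed(self, checkpoint_id):
        self.completed.append(checkpoint_id)

# Before: the callback is stored as a field on the checkpoint.
class PendingCheckpointWithField:
    def __init__(self, checkpoint_id, stats_callback):
        self.checkpoint_id = checkpoint_id
        self._stats_callback = stats_callback

    def finalize(self):
        self._stats_callback.report_completed(self.checkpoint_id)

# After: no field; the tracker is passed as an argument by the caller.
class PendingCheckpoint:
    def __init__(self, checkpoint_id):
        self.checkpoint_id = checkpoint_id

    def finalize(self, stats_tracker):
        stats_tracker.report_completed(self.checkpoint_id)

tracker = StatsTracker()
PendingCheckpoint(7).finalize(tracker)
print(tracker.completed)  # [7]
```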



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[Announce] SQL docs are now Blink only

2021-01-19 Thread Seth Wiesman
Hi Everyone,

I just merged in a PR to make the SQL / Table docs Blink planner only.
Going forward, you do not need to mark something as Blink only or explain
divergent semantics. Simply write the docs as if Blink were the only
planner.

There is a Legacy planner specific page[1]. If you add a feature that is
not supported by the legacy planner, simply add a note to the legacy page
stating as such.

Seth

[1]
https://github.com/apache/flink/blob/master/docs/dev/table/legacy_planner.md


Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Till Rohrmann
I think a short FLIP would be awesome.

I guess this feature hasn't been implemented yet because it has not been
implemented yet ;-) I agree that this feature will improve configuration
ergonomics big time :-)

Cheers,
Till

On Tue, Jan 19, 2021 at 12:28 PM Ufuk Celebi  wrote:

> Hey all,
>
> I think that approach 2 is more idiomatic for container deployments where
> it can be cumbersome to manually map flink-conf.yaml contents to env vars
> [1]. The precedence order outlined by Till would also cover Steven's
> hierarchical overwrite requirement.
>
> I'm really excited about this feature as it will make Flink deployments a
> lot more ergonomic. The implementation seems to be not too complicated
> (which makes me wonder why we didn't tackle this earlier or whether I'm
> missing something).
>
> I'd also be happy to shepherd this contribution if there is consensus on
> the need for it and the approach. Does it make sense to formalize this
> decision a bit with a short FLIP?
>
> – Ufuk
>
> [1] In Ververica Platform, we support approach 1, because the Flink
> configuration is part of the specification for a single Deployment and it's
> minimally more convenient to have something like
>
> flinkConfiguration:
>   foo: ${BAR}
>
> for us. I don't think this approach would feel natural when manually
> deploying Flink. There would be a clear migration path for our customers,
> so I'm not concerned about this too much.
>
> On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote:
> > Hi everyone,
> >
> > Thanks for starting this discussion Ingo. I think being able to use env
> > variables to change Flink's configuration will be a very useful feature.
> >
> > Concerning the two approaches I would be in favour of the second approach
> > ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
> > prepare a special flink-conf.yaml where he inserts env variables for
> every
> > config value he wants to configure. Since this is not required with the
> > second approach, I think it is more general and easier to use. Also, the
> > user does not have to remember a second set of names (env names) which he
> > has to/can set.
> >
> > For how to substitute the values, I think it should happen when we load
> the
> > Flink configuration. First we read the file and then overwrite values
> > specified via an env variable or dynamic properties in some defined
> order.
> > For env.java.opts and other options which are used for starting the JVM
> we
> > might need special handling in the bash scripts.
> >
> > Cheers,
> > Till
> >
> > On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk  wrote:
> >
> > > Hi Yang,
> > >
> > > 1. As you said I think this doesn't affect Ververica Platform, really,
> so
> > > I'm more than happy to hear and follow the thoughts of people more
> > > experienced with Flink than me.
> > > 2. I wasn't aware of env.java.opts, but that's definitely a candidate
> where
> > > a user may want to "escape" it so it doesn't get substituted
> immediately, I
> > > agree.
> > >
> > >
> > > Regards
> > > Ingo
> > >
> > > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang 
> wrote:
> > >
> > > > Hi Ingo,
> > > >
> > > > Thanks for your response.
> > > >
> > > > 1. Not distinguishing JM/TM is reasonable, but what about the client
> > > > side? For Yarn/K8s deployments, the local flink-conf.yaml will be
> > > > shipped to the JM/TM. So I am just confused about where the environment
> > > > variables should be replaced. IIUC, it is not an issue for Ververica
> > > > Platform since it is always done on the JM/TM side.
> > > >
> > > > 2. I believe we should support not doing the substitution for specific
> > > > keys. A typical use case is "env.java.opts". If the value contains
> > > > environment variables, they are expected to be replaced exactly when
> > > > the java command is executed, not after the java process is started.
> > > > Maybe escaping with single quotes is enough.
> > > >
> > > > 3. Making the substitution take effect only on the value makes sense
> > > > to me.
> > > >
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Steven Wu  于2021年1月19日周二 上午12:36写道:
> > > >
> > > > > Variable substitution (proposed here) is definitely useful.
> > > > >
> > > > > For us, hierarchical override is more useful.  E.g., we may have
> the
> > > > > default value of "state.checkpoints.dir=path1" defined in
> > > > flink-conf.yaml.
> > > > > But maybe we want to override it to "state.checkpoints.dir=path2"
> via
> > > > > environment variable in some scenarios. Otherwise, we have to
> define a
> > > > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the
> Flink
> > > > > config, which is annoying.
> > > > >
> > > > > As Ingo pointed out, it is also annoying to handle the Java property
> > > > > key naming convention (dot-separated), as dots aren't allowed in
> > > > > shell env var names (all caps, separated with underscores); the shell
> > > > > will complain. We have to bundle all env var overrides 

[jira] [Created] (FLINK-21032) JsonFileCompactionITCase fails on azure

2021-01-19 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-21032:


 Summary: JsonFileCompactionITCase fails on azure
 Key: FLINK-21032
 URL: https://issues.apache.org/jira/browse/FLINK-21032
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Ecosystem
Affects Versions: 1.13.0
Reporter: Dawid Wysakowicz
 Fix For: 1.13.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12230&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20

{code}
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 11.39 s 
<<< FAILURE! - in org.apache.flink.formats.json.JsonFileCompactionITCase
[ERROR] 
testSingleParallelism(org.apache.flink.formats.json.JsonFileCompactionITCase)  
Time elapsed: 1.21 s  <<< FAILURE!
java.lang.AssertionError: expected:<[+I[0, 0, 0], +I[0, 0, 0], +I[1, 1, 1], 
+I[1, 1, 1], +I[2, 2, 2], +I[2, 2, 2], +I[3, 3, 3], +I[3, 3, 3], +I[4, 4, 4], 
+I[4, 4, 4], +I[5, 5, 5], +I[5, 5, 5], +I[6, 6, 6], +I[6, 6, 6], +I[7, 7, 7], 
+I[7, 7, 7], +I[8, 8, 8], +I[8, 8, 8], +I[9, 9, 9], +I[9, 9, 9], +I[10, 0, 0], 
+I[10, 0, 0], +I[11, 1, 1], +I[11, 1, 1], +I[12, 2, 2], +I[12, 2, 2], +I[13, 3, 
3], +I[13, 3, 3], +I[14, 4, 4], +I[14, 4, 4], +I[15, 5, 5], +I[15, 5, 5], 
+I[16, 6, 6], +I[16, 6, 6], +I[17, 7, 7], +I[17, 7, 7], +I[18, 8, 8], +I[18, 8, 
8], +I[19, 9, 9], +I[19, 9, 9], +I[20, 0, 0], +I[20, 0, 0], +I[21, 1, 1], 
+I[21, 1, 1], +I[22, 2, 2], +I[22, 2, 2], +I[23, 3, 3], +I[23, 3, 3], +I[24, 4, 
4], +I[24, 4, 4], +I[25, 5, 5], +I[25, 5, 5], +I[26, 6, 6], +I[26, 6, 6], 
+I[27, 7, 7], +I[27, 7, 7], +I[28, 8, 8], +I[28, 8, 8], +I[29, 9, 9], +I[29, 9, 
9], +I[30, 0, 0], +I[30, 0, 0], +I[31, 1, 1], +I[31, 1, 1], +I[32, 2, 2], 
+I[32, 2, 2], +I[33, 3, 3], +I[33, 3, 3], +I[34, 4, 4], +I[34, 4, 4], +I[35, 5, 
5], +I[35, 5, 5], +I[36, 6, 6], +I[36, 6, 6], +I[37, 7, 7], +I[37, 7, 7], 
+I[38, 8, 8], +I[38, 8, 8], +I[39, 9, 9], +I[39, 9, 9], +I[40, 0, 0], +I[40, 0, 
0], +I[41, 1, 1], +I[41, 1, 1], +I[42, 2, 2], +I[42, 2, 2], +I[43, 3, 3], 
+I[43, 3, 3], +I[44, 4, 4], +I[44, 4, 4], +I[45, 5, 5], +I[45, 5, 5], +I[46, 6, 
6], +I[46, 6, 6], +I[47, 7, 7], +I[47, 7, 7], +I[48, 8, 8], +I[48, 8, 8], 
+I[49, 9, 9], +I[49, 9, 9], +I[50, 0, 0], +I[50, 0, 0], +I[51, 1, 1], +I[51, 1, 
1], +I[52, 2, 2], +I[52, 2, 2], +I[53, 3, 3], +I[53, 3, 3], +I[54, 4, 4], 
+I[54, 4, 4], +I[55, 5, 5], +I[55, 5, 5], +I[56, 6, 6], +I[56, 6, 6], +I[57, 7, 
7], +I[57, 7, 7], +I[58, 8, 8], +I[58, 8, 8], +I[59, 9, 9], +I[59, 9, 9], 
+I[60, 0, 0], +I[60, 0, 0], +I[61, 1, 1], +I[61, 1, 1], +I[62, 2, 2], +I[62, 2, 
2], +I[63, 3, 3], +I[63, 3, 3], +I[64, 4, 4], +I[64, 4, 4], +I[65, 5, 5], 
+I[65, 5, 5], +I[66, 6, 6], +I[66, 6, 6], +I[67, 7, 7], +I[67, 7, 7], +I[68, 8, 
8], +I[68, 8, 8], +I[69, 9, 9], +I[69, 9, 9], +I[70, 0, 0], +I[70, 0, 0], 
+I[71, 1, 1], +I[71, 1, 1], +I[72, 2, 2], +I[72, 2, 2], +I[73, 3, 3], +I[73, 3, 
3], +I[74, 4, 4], +I[74, 4, 4], +I[75, 5, 5], +I[75, 5, 5], +I[76, 6, 6], 
+I[76, 6, 6], +I[77, 7, 7], +I[77, 7, 7], +I[78, 8, 8], +I[78, 8, 8], +I[79, 9, 
9], +I[79, 9, 9], +I[80, 0, 0], +I[80, 0, 0], +I[81, 1, 1], +I[81, 1, 1], 
+I[82, 2, 2], +I[82, 2, 2], +I[83, 3, 3], +I[83, 3, 3], +I[84, 4, 4], +I[84, 4, 
4], +I[85, 5, 5], +I[85, 5, 5], +I[86, 6, 6], +I[86, 6, 6], +I[87, 7, 7], 
+I[87, 7, 7], +I[88, 8, 8], +I[88, 8, 8], +I[89, 9, 9], +I[89, 9, 9], +I[90, 0, 
0], +I[90, 0, 0], +I[91, 1, 1], +I[91, 1, 1], +I[92, 2, 2], +I[92, 2, 2], 
+I[93, 3, 3], +I[93, 3, 3], +I[94, 4, 4], +I[94, 4, 4], +I[95, 5, 5], +I[95, 5, 
5], +I[96, 6, 6], +I[96, 6, 6], +I[97, 7, 7], +I[97, 7, 7], +I[98, 8, 8], 
+I[98, 8, 8], +I[99, 9, 9], +I[99, 9, 9]]> but was:<[+I[0, 0, 0], +I[0, 0, 0], 
+I[1, 1, 1], +I[1, 1, 1], +I[2, 2, 2], +I[3, 3, 3], +I[3, 3, 3], +I[4, 4, 4], 
+I[4, 4, 4], +I[5, 5, 5], +I[6, 6, 6], +I[6, 6, 6], +I[7, 7, 7], +I[7, 7, 7], 
+I[8, 8, 8], +I[9, 9, 9], +I[9, 9, 9], +I[10, 0, 0], +I[10, 0, 0], +I[11, 1, 
1], +I[12, 2, 2], +I[12, 2, 2], +I[13, 3, 3], +I[13, 3, 3], +I[14, 4, 4], 
+I[15, 5, 5], +I[15, 5, 5], +I[16, 6, 6], +I[16, 6, 6], +I[17, 7, 7], +I[18, 8, 
8], +I[18, 8, 8], +I[19, 9, 9], +I[19, 9, 9], +I[20, 0, 0], +I[21, 1, 1], 
+I[21, 1, 1], +I[22, 2, 2], +I[22, 2, 2], +I[23, 3, 3], +I[24, 4, 4], +I[24, 4, 
4], +I[25, 5, 5], +I[25, 5, 5], +I[26, 6, 6], +I[27, 7, 7], +I[27, 7, 7], 
+I[28, 8, 8], +I[28, 8, 8], +I[29, 9, 9], +I[30, 0, 0], +I[30, 0, 0], +I[31, 1, 
1], +I[31, 1, 1], +I[32, 2, 2], +I[33, 3, 3], +I[33, 3, 3], +I[34, 4, 4], 
+I[34, 4, 4], +I[35, 5, 5], +I[36, 6, 6], +I[36, 6, 6], +I[37, 7, 7], +I[37, 7, 
7], +I[38, 8, 8], +I[39, 9, 9], +I[39, 9, 9], +I[40, 0, 0], +I[40, 0, 0], 
+I[41, 1, 1], +I[42, 2, 2], +I[42, 2, 2], +I[43, 3, 3], +I[43, 3, 3], +I[44, 4, 
4], +I[45, 5, 5], +I[45, 5, 5], +I[46, 6, 6], +I[46, 6, 6], +I[47, 7, 7], 
+I[48, 8, 8], +I[48, 8, 8], +I[49, 9, 9], +I[49, 9, 9], +I[50, 0, 0], +I[51, 1, 
1], +I[51, 1, 1], +I[52, 2, 2], +I[52, 2, 2], 

[jira] [Created] (FLINK-21031) JobMasterStopWithSavepointIT test is not run due to wrong name

2021-01-19 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21031:


 Summary: JobMasterStopWithSavepointIT test is not run due to wrong 
name
 Key: FLINK-21031
 URL: https://issues.apache.org/jira/browse/FLINK-21031
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Chesnay Schepler
 Fix For: 1.13.0


The {{JobMasterStopWithSavepointIT}} test is not run when testing via Maven 
because its name does not end in {{ITCase}}.
Coincidentally this test is currently failing, getting stuck when triggering a 
savepoint.
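A toy illustration of this class of problem, assuming the build picks up integration tests via an include pattern like {{**/*ITCase.java}} (the exact pattern used by the build is an assumption here):

```python
def misnamed_integration_tests(names):
    """Return files that look like integration tests ("...IT.java")
    but would not match a **/*ITCase.java include pattern."""
    return [n for n in names
            if n.endswith("IT.java") and not n.endswith("ITCase.java")]

files = [
    "JobMasterStopWithSavepointIT.java",  # silently skipped by Maven
    "JsonFileCompactionITCase.java",      # picked up as intended
    "TaskManagerTest.java",               # plain unit test, fine
]

print(misnamed_integration_tests(files))
# ['JobMasterStopWithSavepointIT.java']
```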



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Wei Zhong
Thanks Xintong for the great work!

Best,
Wei

> On Jan 19, 2021, at 18:00, Guowei Ma wrote:
> 
> Thanks Xintong's effort!
> Best,
> Guowei
> 
> 
> On Tue, Jan 19, 2021 at 5:37 PM Yangze Guo wrote:
> Thanks Xintong for the great work!
> 
> Best,
> Yangze Guo
> 
> On Tue, Jan 19, 2021 at 4:47 PM Till Rohrmann wrote:
> >
> > Thanks a lot for driving this release Xintong. This was indeed a release 
> > with some obstacles to overcome and you did it very well!
> >
> > Cheers,
> > Till
> >
> > On Tue, Jan 19, 2021 at 5:59 AM Xingbo Huang wrote:
> >>
> >> Thanks Xintong for the great work!
> >>
> >> Best,
> >> Xingbo
> >>
> >> Peter Huang wrote on Tue, Jan 19, 2021 at 12:51 PM:
> >>
> >> > Thanks for the great effort to make this happen. It paves the way for
> >> > us to start using 1.12 soon.
> >> >
> >> > Best Regards
> >> > Peter Huang
> >> >
> >> > On Mon, Jan 18, 2021 at 8:16 PM Yang Wang wrote:
> >> >
> >> > > Thanks Xintong for the great work as our release manager!
> >> > >
> >> > >
> >> > > Best,
> >> > > Yang
> >> > >
> >> > > Xintong Song wrote on Tue, Jan 19, 2021 at 11:53 AM:
> >> > >
> >> > >> The Apache Flink community is very happy to announce the release of
> >> > >> Apache Flink 1.12.1, which is the first bugfix release for the Apache
> >> > Flink
> >> > >> 1.12 series.
> >> > >>
> >> > >> Apache Flink® is an open-source stream processing framework for
> >> > >> distributed, high-performing, always-available, and accurate data
> >> > streaming
> >> > >> applications.
> >> > >>
> >> > >> The release is available for download at:
> >> > >> https://flink.apache.org/downloads.html
> >> > >>
> >> > >> Please check out the release blog post for an overview of the
> >> > >> improvements for this bugfix release:
> >> > >> https://flink.apache.org/news/2021/01/19/release-1.12.1.html
> >> > >>
> >> > >> The full release notes are available in Jira:
> >> > >>
> >> > >>
> >> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
> >> > >>
> >> > >> We would like to thank all contributors of the Apache Flink community
> >> > who
> >> > >> made this release possible!
> >> > >>
> >> > >> Regards,
> >> > >> Xintong
> >> > >>
> >> > >
> >> >



Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Ufuk Celebi
Hey all,

I think that approach 2 is more idiomatic for container deployments where it 
can be cumbersome to manually map flink-conf.yaml contents to env vars [1]. The 
precedence order outlined by Till would also cover Steven's hierarchical 
overwrite requirement.

I'm really excited about this feature as it will make Flink deployments a lot 
more ergonomic. The implementation seems to be not too complicated (which makes 
me wonder why we didn't tackle this earlier or whether I'm missing something).

I'd also be happy to shepherd this contribution if there is consensus on the 
need for it and the approach. Does it make sense to formalize this decision a 
bit with a short FLIP?

– Ufuk

[1] In Ververica Platform, we support approach 1, because the Flink 
configuration is part of the specification for a single Deployment and it's 
minimally more convenient to have something like

flinkConfiguration:
  foo: ${BAR}

for us. I don't think this approach would feel natural when manually deploying 
Flink. There would be a clear migration path for our customers, so I'm not 
concerned about this too much.

On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote:
> Hi everyone,
> 
> Thanks for starting this discussion Ingo. I think being able to use env
> variables to change Flink's configuration will be a very useful feature.
> 
> Concerning the two approaches I would be in favour of the second approach
> ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
> prepare a special flink-conf.yaml where he inserts env variables for every
> config value he wants to configure. Since this is not required with the
> second approach, I think it is more general and easier to use. Also, the
> user does not have to remember a second set of names (env names) which he
> has to/can set.
> 
> For how to substitute the values, I think it should happen when we load the
> Flink configuration. First we read the file and then overwrite values
> specified via an env variable or dynamic properties in some defined order.
> For env.java.opts and other options which are used for starting the JVM we
> might need special handling in the bash scripts.
> 
> Cheers,
> Till
> 
> On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk  wrote:
> 
> > Hi Yang,
> >
> > 1. As you said I think this doesn't affect Ververica Platform, really, so
> > I'm more than happy to hear and follow the thoughts of people more
> > experienced with Flink than me.
> > 2. I wasn't aware of env.java.opts, but that's definitely a candidate where
> > a user may want to "escape" it so it doesn't get substituted immediately, I
> > agree.
> >
> >
> > Regards
> > Ingo
> >
> > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang  wrote:
> >
> > > Hi Ingo,
> > >
> > > Thanks for your response.
> > >
> > > 1. Not distinguishing JM/TM is reasonable, but what about the client
> > > side? For Yarn/K8s deployments, the local flink-conf.yaml will be
> > > shipped to the JM/TM. So I am just confused about where the environment
> > > variables should be replaced. IIUC, it is not an issue for Ververica
> > > Platform since it is always done on the JM/TM side.
> > >
> > > 2. I believe we should support not doing the substitution for specific
> > > keys. A typical use case is "env.java.opts". If the value contains
> > > environment variables, they are expected to be replaced exactly when the
> > > java command is executed, not after the java process is started. Maybe
> > > escaping with single quotes is enough.
> > >
> > > 3. Making the substitution take effect only on the value makes sense to me.
> > >
> > >
> > > Best,
> > > Yang
> > >
> > > Steven Wu wrote on Tue, Jan 19, 2021 at 12:36 AM:
> > >
> > > > Variable substitution (proposed here) is definitely useful.
> > > >
> > > > For us, hierarchical override is more useful.  E.g., we may have the
> > > > default value of "state.checkpoints.dir=path1" defined in
> > > flink-conf.yaml.
> > > > But maybe we want to override it to "state.checkpoints.dir=path2" via
> > > > environment variable in some scenarios. Otherwise, we have to define a
> > > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> > > > config, which is annoying.
> > > >
> > > > As Ingo pointed out, it is also annoying to handle the Java property
> > > > key naming convention (dot-separated), as dots aren't allowed in shell
> > > > env var names (all caps, separated with underscores); the shell will
> > > > complain. We have to bundle all env var overrides (k-v pairs) in a
> > > > single property value (JSON and base64 encoded) to avoid it.
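As a rough sketch of approach 2 and the naming issue above: the `FLINK_CONFIG_` prefix, the exclusion of env.java.opts, and the restriction to keys already present in the file are assumptions for illustration, not the proposed design.

```python
# Hedged sketch: derive an env var name from a config key (dots are not
# allowed in shell variable names, so map them to underscores), then
# overlay env values onto the values loaded from flink-conf.yaml.
import os

PREFIX = "FLINK_CONFIG_"
NO_SUBSTITUTION = {"env.java.opts"}  # handled by the bash scripts instead

def env_var_name(config_key: str) -> str:
    return PREFIX + config_key.upper().replace(".", "_").replace("-", "_")

def apply_env_overrides(config: dict, environ=os.environ) -> dict:
    """Env vars win over file values. This sketch only overrides keys
    already present in the file, which the real design need not do."""
    merged = dict(config)
    for key in config:
        if key in NO_SUBSTITUTION:
            continue
        value = environ.get(env_var_name(key))
        if value is not None:
            merged[key] = value
    return merged

conf = {"state.checkpoints.dir": "path1", "env.java.opts": "-Xmx1g"}
env = {"FLINK_CONFIG_STATE_CHECKPOINTS_DIR": "path2"}
print(apply_env_overrides(conf, env))
# {'state.checkpoints.dir': 'path2', 'env.java.opts': '-Xmx1g'}
```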
> > > >
> > > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk  wrote:
> > > >
> > > > > Hi Yang,
> > > > >
> > > > > thanks for your questions! I'm glad to see this feature is being
> > > received
> > > > > positively.
> > > > >
> > > > > ad 1) We don't distinguish JM/TM, and I can't think of a good reason
> > > why
> > > > a
> > > > > user would want to do so. I'm not very experienced with Flink,
> > however,
> > > > so
> > > > > 

[jira] [Created] (FLINK-21030) Broken job restart for job with disjoint graph

2021-01-19 Thread Theo Diefenthal (Jira)
Theo Diefenthal created FLINK-21030:
---

 Summary: Broken job restart for job with disjoint graph
 Key: FLINK-21030
 URL: https://issues.apache.org/jira/browse/FLINK-21030
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.11.2
Reporter: Theo Diefenthal


Building on top of bugs:

https://issues.apache.org/jira/browse/FLINK-21028

 and https://issues.apache.org/jira/browse/FLINK-21029 : 

I tried to stop a Flink application on YARN via savepoint, which didn't succeed 
due to a possible bug/race condition during shutdown (FLINK-21028). For some 
reason, Flink attempted to restart the pipeline after the failure during 
shutdown (FLINK-21029). The bug here:

As I mentioned, my job graph is disjoint and the pipelines are fully isolated. 
Let's say the original error occurred in a single task of pipeline1. Flink then 
restarted the entire pipeline1, but pipeline2 was shut down successfully and 
switched its state to FINISHED.

My job was thus left in a kind of invalid state after the attempt to stop it: 
one of the two pipelines was running, the other was FINISHED. I guess the bug 
in the restart behavior is that only the connected component of the graph 
containing the failure is restarted, while the others aren't...
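The behavior described here, where only the failing pipeline's connected component is restarted, can be sketched with a toy component computation (task names are invented):

```python
# Two disjoint pipelines in one job graph, as in this report.
from collections import defaultdict

edges = [("src1", "map1"), ("map1", "sink1"),   # pipeline1
         ("src2", "sink2")]                     # pipeline2 (disjoint)

def component_of(start, edges):
    """Undirected connected component containing `start` (DFS)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node] - seen)
    return seen

# A failure in "map1" restarts pipeline1 only; pipeline2 keeps its
# FINISHED state, which is how the mixed RUNNING/FINISHED job state
# described above can arise during an aborted stop-with-savepoint.
print(sorted(component_of("map1", edges)))  # ['map1', 'sink1', 'src1']
```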



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21029) Failure of shutdown lead to restart of (connected) pipeline

2021-01-19 Thread Theo Diefenthal (Jira)
Theo Diefenthal created FLINK-21029:
---

 Summary: Failure of shutdown lead to restart of (connected) 
pipeline
 Key: FLINK-21029
 URL: https://issues.apache.org/jira/browse/FLINK-21029
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.11.2
Reporter: Theo Diefenthal


This bug happened in combination with 
https://issues.apache.org/jira/browse/FLINK-21028 .

When I wanted to stop a job via the CLI ("flink stop ...") on a disjoint job 
graph (independent pipelines in the graph), one task wasn't able to stop 
properly (reported in the bug mentioned above). This led to the job being 
restarted. I think this is wrong behavior in general and a separate bug:

If any crash occurs while trying to stop a job, Flink shouldn't try to restart 
it but should continue stopping the job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21028) Streaming application didn't stop properly

2021-01-19 Thread Theo Diefenthal (Jira)
Theo Diefenthal created FLINK-21028:
---

 Summary: Streaming application didn't stop properly 
 Key: FLINK-21028
 URL: https://issues.apache.org/jira/browse/FLINK-21028
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.11.2
Reporter: Theo Diefenthal


I have a Flink job running on YARN with a disjoint graph, i.e. a single job 
contains two independent and isolated pipelines.

From time to time, I stop the job with a savepoint like so:
{code:java}
flink stop -p ${SAVEPOINT_BASEDIR}/${FLINK_JOB_NAME}/SAVEPOINTS 
--yarnapplicationId=${FLINK_YARN_APPID} ${ID}{code}
A few days ago, this job suddenly didn't stop properly as usual but ran into a 
possible race condition.

On the CLI with stop, I received a simple timeout:
{code:java}
org.apache.flink.util.FlinkException: Could not stop with a savepoint job 
"f23290bf5fb0ecd49a4455e4a65f2eb6".
 at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:495)
 at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:864)
 at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:487)
 at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:931)
 at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
 at 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
 at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
Caused by: java.util.concurrent.TimeoutException
 at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
 at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:493)
 ... 9 more{code}
 

The root of the problem, however, is that on a taskmanager I received an 
exception during shutdown, which led to restarting (a part of) the pipeline and 
putting it back into the RUNNING state; thus the console command for stopping 
timed out (as the job was partially back in the RUNNING state). The exception, 
which looks like a race condition to me, is in the logs:
{code:java}
2021-01-12T06:15:15.827877+01:00 WARN org.apache.flink.runtime.taskmanager.Task 
Source: rawdata_source1 -> validation_source1 -> enrich_source1 -> 
map_json_source1 -> Sink: write_to_kafka_source1) (3/18) 
(bc68320cf69dd877782417a3298499d6) switched from RUNNING to FAILED.
java.util.concurrent.ExecutionException: 
org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: 
Could not forward element to next operator
 at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.quiesceTimeServiceAndCloseOperator(StreamOperatorWrapper.java:161)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:130)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:134)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:134)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:134)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:134)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:134)
 at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:80)
 at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.closeOperators(OperatorChain.java:302)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.afterInvoke(StreamTask.java:576)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:544)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
 at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: 
Could not forward element to next operator
 at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.emitWatermark(OperatorChain.java:642)
 at 
org.apache.flink.streaming.api.operators.CountingOutput.emitWatermark(CountingOutput.java:41)
 at 
org.apache.flink.streaming.runtime.operators.TimestampsAndWatermarksOperator$WatermarkEmitter.emitWatermark(TimestampsAndWatermarksOperator.java:165)
 at 

[jira] [Created] (FLINK-21027) Add isKeyValueImmutable() method to KeyedStateBackend interface

2021-01-19 Thread Jark Wu (Jira)
Jark Wu created FLINK-21027:
---

 Summary: Add isKeyValueImmutable() method to KeyedStateBackend 
interface
 Key: FLINK-21027
 URL: https://issues.apache.org/jira/browse/FLINK-21027
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Reporter: Jark Wu


In Table/SQL operators, we have some optimizations that reuse key and record 
objects. For example, we buffer input records in {{BytesMultiMap}} and use the 
reused object to map to the underlying memory segment to reduce byte copies.

However, if we put the reused key and value into the Heap state backend, the 
result will be wrong, because it is not allowed to mutate keys and values in 
the Heap state backend.

Therefore, it would be great if {{KeyedStateBackend}} could expose such an 
API, so that Table/SQL could dynamically decide whether to copy the keys and 
values before putting them into state.
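A minimal sketch of the hazard, using a plain Python dict as a stand-in for the heap state backend (the record shape and the copy-on-put decision are illustrative; the proposed API itself is not modeled here):

```python
# The operator reuses one record object for every input, as the
# Table/SQL optimizations described above do with keys and records.
reused_record = {"count": 0}

heap_state = {}  # heap-style backend: stores references, no copies
for key, count in [("a", 1), ("b", 2)]:
    reused_record["count"] = count
    heap_state[key] = reused_record        # no defensive copy: unsafe!

print(heap_state)  # {'a': {'count': 2}, 'b': {'count': 2}} -- 'a' corrupted

# Copying before the put (what the caller would do when the backend
# reports that it does NOT keep keys/values immutable) fixes it:
safe_state = {}
for key, count in [("a", 1), ("b", 2)]:
    reused_record["count"] = count
    safe_state[key] = dict(reused_record)  # defensive copy: safe

print(safe_state)  # {'a': {'count': 1}, 'b': {'count': 2}}
```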



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Correct time-related function behavior in Flink SQL

2021-01-19 Thread Leonard Xu
I found that the above example format may be messed up in different mail 
clients, so I posted a picture here [1].

Best,
Leonard

[1] 
https://github.com/leonardBang/flink-sql-etl/blob/master/etl-job/src/main/resources/pictures/CURRRENT_TIMESTAMP.png
 

 

> On Jan 19, 2021, at 16:22, Leonard Xu wrote:
> 
> Hi, all
> 
> I want to start a discussion about correcting time-related function
> behavior in Flink SQL. This is a tricky topic, but I think it's time to
> address it.
> 
> Currently some temporal function behaviors are weird to users.
> 1. When users use PROCTIME() in SQL, the value of PROCTIME() has a time zone
> offset from the wall-clock time in the user's local time zone, so users need
> to add their local time zone offset manually to get the expected local
> timestamp (e.g. users in Germany need to add +1h).
> 
> 2. Users cannot use CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP to get the
> wall-clock timestamp in their local time zone, and thus they need to write a
> UDF in their SQL just to implement a simple filter like
> WHERE date_col = CURRENT_DATE.
> 
> 3. Another common case is a time window with a day interval based on
> PROCTIME(): users plan to put all data from one day into the same window,
> but the window is assigned using the timestamp in the UTC+0 time zone rather
> than the session time zone, which makes the window start with an offset
> (e.g. users in China need to add -8h at the start of their business SQL and
> then +8h when outputting the result; the conversion feels like magic to
> users).
> 
> These problems arise because many time-related functions like PROCTIME(),
> NOW(), CURRENT_DATE, CURRENT_TIME and CURRENT_TIMESTAMP return time
> values based on the UTC+0 time zone.
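The mismatch can be reproduced outside Flink with plain datetime arithmetic (the instant is fixed for reproducibility and matches the CURRENT_TIMESTAMP example values in this message):

```python
# Sketch of the offset issue: a function returning the UTC wall-clock
# reading looks "wrong" to a user whose session time zone is UTC+8.
from datetime import datetime, timedelta, timezone

utc_now = datetime(2020, 12, 28, 23, 52, 52, tzinfo=timezone.utc)

# What a UTC-based CURRENT_TIMESTAMP yields (naive, no time zone):
flink_current_timestamp = utc_now.replace(tzinfo=None)

# What a user in UTC+8 expects (wall clock in the session time zone):
session_tz = timezone(timedelta(hours=8))
expected = utc_now.astimezone(session_tz).replace(tzinfo=None)

print(flink_current_timestamp)  # 2020-12-28 23:52:52
print(expected)                 # 2020-12-29 07:52:52  (+8h offset)
```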
> 
> This topic leads to a comparison of the three types, i.e.
> TIMESTAMP/TIMESTAMP WITHOUT TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE and
> TIMESTAMP WITH TIME ZONE. To help understand the three types better, I
> wrote a document[1]. You can also learn about the behavior of the three
> timestamp types in the Hadoop ecosystem from the reference links in the doc.
> 
> 
> I investigated the current behavior of all Flink time-related functions and
> compared it with other DB vendors like Pg, Presto, Hive, Spark and
> Snowflake. I made a spreadsheet[2] to organize the results well; we can use
> it for the next round of discussion. Please let me know if I missed
> something.
> From my investigation, I think we need to correct the behavior of the
> functions NOW()/PROCTIME()/CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP. To
> correct them, we can change the function return type, the return value, or
> both. All of those ways are valid because SQL:2011 does not specify the
> function return types and every SQL engine vendor has its own
> implementation. For example, the CURRENT_TIMESTAMP function:
> 
> * Flink current behavior: CURRENT_TIMESTAMP, return type TIMESTAMP(0)
>   NOT NULL.
>   - session time zone UTC:   2020-12-28T23:52:52
>   - session time zone UTC+8: 2020-12-28T23:52:52
>   - wall clock in UTC+8:     2020-12-29 07:52:52
> * Existing problem: wrong value. It returns the UTC timestamp, but the user
>   expects the current timestamp in the session time zone.
> * Other vendors' behavior:
>   - In MySQL and Spark, the functions NOW() and CURRENT_TIMESTAMP return the
>     current timestamp in the session time zone; the return type is TIMESTAMP.
>   - In Pg and Presto, the functions NOW() and LOCALTIMESTAMP return the
>     current timestamp in the session time zone; the return type is
>     TIMESTAMP WITH TIME ZONE.
>   - In Snowflake, the functions CURRENT_TIMESTAMP / LOCALTIMESTAMP return
>     the current timestamp in the session time zone; the return type is
>     TIMESTAMP WITH LOCAL TIME ZONE.
> * Proposed change: Flink should return the current timestamp in the session
>   time zone; the return type should be TIMESTAMP.
> 
> 
> I tend to only change the return value for these problematic functions and
> introduce an option for compatibility considerations. What do you think?
> 
> 
> Looking forward to your feedback.
> 
> Best,
> Leonard
> 
> [1] 
> https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing
>  
> 
>  
> [2] 
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
>  
> 
>  
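For context on the wall-clock values quoted above: the same instant rendered in two session time zones gives exactly the two readings in question. A minimal java.time sketch (plain Java for illustration only, not Flink API):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class SessionTimezoneDemo {

    // Render one instant as a local timestamp string in the given zone.
    static String render(String isoInstant, String zone) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
                .withZone(ZoneId.of(zone))
                .format(Instant.parse(isoInstant));
    }

    public static void main(String[] args) {
        String instant = "2020-12-28T23:52:52Z";
        System.out.println("UTC:   " + render(instant, "UTC"));    // 2020-12-28 23:52:52
        System.out.println("UTC+8: " + render(instant, "+08:00")); // 2020-12-29 07:52:52
    }
}
```

A session-time-zone-aware CURRENT_TIMESTAMP would correspond to the second call; Flink's current behavior corresponds to the first, regardless of the configured session time zone.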



Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Guowei Ma
Thanks for Xintong's effort!
Best,
Guowei


On Tue, Jan 19, 2021 at 5:37 PM Yangze Guo  wrote:

> Thanks Xintong for the great work!
>
> Best,
> Yangze Guo
>
> On Tue, Jan 19, 2021 at 4:47 PM Till Rohrmann 
> wrote:
> >
> > Thanks a lot for driving this release Xintong. This was indeed a release
> with some obstacles to overcome and you did it very well!
> >
> > Cheers,
> > Till
> >
> > On Tue, Jan 19, 2021 at 5:59 AM Xingbo Huang  wrote:
> >>
> >> Thanks Xintong for the great work!
> >>
> >> Best,
> >> Xingbo
> >>
> >> Peter Huang  于2021年1月19日周二 下午12:51写道:
> >>
> >> > Thanks for the great effort to make this happen. It paves the way
> >> > for us to use 1.12 soon.
> >> >
> >> > Best Regards
> >> > Peter Huang
> >> >
> >> > On Mon, Jan 18, 2021 at 8:16 PM Yang Wang 
> wrote:
> >> >
> >> > > Thanks Xintong for the great work as our release manager!
> >> > >
> >> > >
> >> > > Best,
> >> > > Yang
> >> > >
> >> > > Xintong Song  于2021年1月19日周二 上午11:53写道:
> >> > >
> >> > >> The Apache Flink community is very happy to announce the release of
> >> > >> Apache Flink 1.12.1, which is the first bugfix release for the
> Apache
> >> > Flink
> >> > >> 1.12 series.
> >> > >>
> >> > >> Apache Flink® is an open-source stream processing framework for
> >> > >> distributed, high-performing, always-available, and accurate data
> >> > streaming
> >> > >> applications.
> >> > >>
> >> > >> The release is available for download at:
> >> > >> https://flink.apache.org/downloads.html
> >> > >>
> >> > >> Please check out the release blog post for an overview of the
> >> > >> improvements for this bugfix release:
> >> > >> https://flink.apache.org/news/2021/01/19/release-1.12.1.html
> >> > >>
> >> > >> The full release notes are available in Jira:
> >> > >>
> >> > >>
> >> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
> >> > >>
> >> > >> We would like to thank all contributors of the Apache Flink
> community
> >> > who
> >> > >> made this release possible!
> >> > >>
> >> > >> Regards,
> >> > >> Xintong
> >> > >>
> >> > >
> >> >
>


Re: [Vote] FLIP-157 Migrate Flink Documentation from Jekyll to Hugo

2021-01-19 Thread Leonard Xu
+1

Best,
Leonard

> 在 2021年1月19日,17:32,David Anderson  写道:
> 
> +1
> 
> David
> 
> On Tue, Jan 19, 2021 at 5:28 AM Forward Xu  wrote:
> 
>> +1
>> 
>> Dian Fu  于2021年1月19日周二 上午11:40写道:
>> 
>>> +1
>>> 
 在 2021年1月19日,上午11:34,Jark Wu  写道:
 
 +1
 
 On Tue, 19 Jan 2021 at 01:59, Till Rohrmann 
>>> wrote:
 
> +1,
> 
> Cheers,
> Till
> 
> On Mon, Jan 18, 2021 at 4:12 PM Chesnay Schepler 
> wrote:
> 
>> +1
>> On 1/18/2021 3:50 PM, Seth Wiesman wrote:
>>> Addendum, 72 hours from now is Thursday the 21st :)
>>> 
>>> sorry for the mistake.
>>> 
>>> Seth
>>> 
>>> On Mon, Jan 18, 2021 at 8:41 AM Timo Walther 
> wrote:
>>> 
 +1
 
 Thanks for upgrading our docs infrastructure.
 
 Regards,
 Timo
 
 On 18.01.21 15:29, Seth Wiesman wrote:
> Hi devs,
> 
> The discussion of FLIP-157 [1] seems to have reached a consensus
> through the mailing thread [2]. I would like to start a vote for it.
> 
> The vote will be open until 20th January (72h), unless there is an
> objection or not enough votes.
> 
> Best,
> Seth
> 
> [1]
> 
 
>> 
> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-157+Migrate+Flink+Documentation+from+Jekyll+to+Hugo
> [2]
> 
 
>> 
> 
>>> 
>> https://lists.apache.org/thread.html/r88152bf178381c5e3bc2d7b3554cea3d61cff9ac0edb713dc518d9c7%40%3Cdev.flink.apache.org%3E
 
>> 
>> 
> 
>>> 
>>> 
>> 



Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Yangze Guo
Thanks Xintong for the great work!

Best,
Yangze Guo

On Tue, Jan 19, 2021 at 4:47 PM Till Rohrmann  wrote:
>
> Thanks a lot for driving this release Xintong. This was indeed a release with 
> some obstacles to overcome and you did it very well!
>
> Cheers,
> Till
>
> On Tue, Jan 19, 2021 at 5:59 AM Xingbo Huang  wrote:
>>
>> Thanks Xintong for the great work!
>>
>> Best,
>> Xingbo
>>
>> Peter Huang  于2021年1月19日周二 下午12:51写道:
>>
>> > Thanks for the great effort to make this happen. It paves the way for
>> > us to use 1.12 soon.
>> >
>> > Best Regards
>> > Peter Huang
>> >
>> > On Mon, Jan 18, 2021 at 8:16 PM Yang Wang  wrote:
>> >
>> > > Thanks Xintong for the great work as our release manager!
>> > >
>> > >
>> > > Best,
>> > > Yang
>> > >
>> > > Xintong Song  于2021年1月19日周二 上午11:53写道:
>> > >
>> > >> The Apache Flink community is very happy to announce the release of
>> > >> Apache Flink 1.12.1, which is the first bugfix release for the Apache
>> > Flink
>> > >> 1.12 series.
>> > >>
>> > >> Apache Flink® is an open-source stream processing framework for
>> > >> distributed, high-performing, always-available, and accurate data
>> > streaming
>> > >> applications.
>> > >>
>> > >> The release is available for download at:
>> > >> https://flink.apache.org/downloads.html
>> > >>
>> > >> Please check out the release blog post for an overview of the
>> > >> improvements for this bugfix release:
>> > >> https://flink.apache.org/news/2021/01/19/release-1.12.1.html
>> > >>
>> > >> The full release notes are available in Jira:
>> > >>
>> > >>
>> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
>> > >>
>> > >> We would like to thank all contributors of the Apache Flink community
>> > who
>> > >> made this release possible!
>> > >>
>> > >> Regards,
>> > >> Xintong
>> > >>
>> > >
>> >


Re: [Vote] FLIP-157 Migrate Flink Documentation from Jekyll to Hugo

2021-01-19 Thread David Anderson
+1

David

On Tue, Jan 19, 2021 at 5:28 AM Forward Xu  wrote:

> +1
>
> Dian Fu  于2021年1月19日周二 上午11:40写道:
>
> > +1
> >
> > > 在 2021年1月19日,上午11:34,Jark Wu  写道:
> > >
> > > +1
> > >
> > > On Tue, 19 Jan 2021 at 01:59, Till Rohrmann 
> > wrote:
> > >
> > >> +1,
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Mon, Jan 18, 2021 at 4:12 PM Chesnay Schepler 
> > >> wrote:
> > >>
> > >>> +1
> > >>> On 1/18/2021 3:50 PM, Seth Wiesman wrote:
> >  Addendum, 72 hours from now is Thursday the 21st :)
> > 
> >  sorry for the mistake.
> > 
> >  Seth
> > 
> >  On Mon, Jan 18, 2021 at 8:41 AM Timo Walther 
> > >> wrote:
> > 
> > > +1
> > >
> > > Thanks for upgrading our docs infrastructure.
> > >
> > > Regards,
> > > Timo
> > >
> > > On 18.01.21 15:29, Seth Wiesman wrote:
> > >> Hi devs,
> > >>
> > >> The discussion of FLIP-157 [1] seems to have reached a consensus
> > >> through the mailing thread [2]. I would like to start a vote for it.
> > >>
> > >> The vote will be open until 20th January (72h), unless there is an
> > >> objection or not enough votes.
> > >>
> > >> Best,
> > >> Seth
> > >>
> > >> [1]
> > >>
> > >
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-157+Migrate+Flink+Documentation+from+Jekyll+to+Hugo
> > >> [2]
> > >>
> > >
> > >>>
> > >>
> >
> https://lists.apache.org/thread.html/r88152bf178381c5e3bc2d7b3554cea3d61cff9ac0edb713dc518d9c7%40%3Cdev.flink.apache.org%3E
> > >
> > >>>
> > >>>
> > >>
> >
> >
>


Re: Re: [DISCUSS] Dealing with deprecated and legacy code in Flink

2021-01-19 Thread Till Rohrmann
Thanks a lot for starting this discussion Timo. I like the idea of setting
more explicit guidelines for deprecating functionality.

I really like the idea of adding with the @Deprecated annotation since when
the function is deprecated. Based on that one can simply search for
features which should be removed in a given release. Alternatively, one
could as you said also state the removal version.

I think what also works is to directly create a critical JIRA issue with
removing functionality as soon as one deprecates something. The problem was
often that after deprecating something, it gets forgotten.

For dropping connectors I am a bit uncertain. From a project management
perspective it sounds like a good idea to not have to support connectors
which are no longer supported for some time. However, what if this
connector is still very popular and in heavy use by our users? Just because
an external system or a version of it is no longer maintained does not mean
that the system is no longer used. I think our current practice with trying
to judge whether our users still use this feature/connector works somewhat.
On the other hand, having these guidelines would probably make it easier to
argue to remove something even if there are still a couple of users.

Cheers,
Till
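For reference, Java 9's own @Deprecated annotation already carries `since` and `forRemoval` elements. A Flink-specific variant of the idea discussed above might look like the following sketch; the annotation name and its elements are hypothetical and not an existing Flink API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DeprecationDemo {

    /** Hypothetical marker: since which release an API is deprecated and when removal is planned. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR, ElementType.FIELD})
    public @interface DeprecatedSince {
        String since();                      // release in which deprecation happened, e.g. "1.12"
        String plannedRemoval() default "";  // release targeted for removal; empty means undecided
    }

    @Deprecated
    @DeprecatedSince(since = "1.12", plannedRemoval = "1.14")
    static int legacyApi() {
        return 42;
    }

    /** Reads the metadata back, e.g. for a pre-release check scanning for overdue removals. */
    static String removalInfo() {
        try {
            DeprecatedSince d = DeprecationDemo.class
                    .getDeclaredMethod("legacyApi")
                    .getAnnotation(DeprecatedSince.class);
            return d.since() + " -> " + d.plannedRemoval();
        } catch (NoSuchMethodException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(removalInfo());
    }
}
```

Because the retention is RUNTIME, a release tooling step could scan the codebase for annotated members whose planned-removal version matches the upcoming release, which addresses the "deprecation gets forgotten" problem.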

On Mon, Jan 18, 2021 at 11:37 AM Yun Gao 
wrote:

> Hi,
>
> Very thanks for @Timo to initiate the discussion!
>
> I would also +1 providing some information to users via annotations
> or documentation in advance, so as not to surprise users before we
> actually remove the legacy code.
> If we finally decide to change some functionality that users could
> notice, perhaps one premise is that Flink has provided a replacement
> for it and users can migrate their applications easily. Then we might
> also consider having a dedicated documentation page that lists the
> functionalities to change and how users can migrate.
>
> To decide whether to remove some legacy code, we might also consider
> having a survey like the one we did for Mesos support [1] to see how
> the functionality is used.
>
> Best,
>  Yun
>
>
> [1]
> https://lists.apache.org/thread.html/r139b11190a6d1f09c9e44d5fa985fd8d310347e66d2324ec1f0c2d87%40%3Cuser.flink.apache.org%3E
>
>
>
>  --Original Mail --
> Sender:Piotr Nowojski 
> Send Date:Mon Jan 18 18:23:36 2021
> Recipients:dev 
> Subject:Re: [DISCUSS] Dealing with deprecated and legacy code in Flink
> Hi Timo,
>
> Thanks for starting this discussion. I'm not sure how we should approach
> this topic and what should be our final recommendation, but definitely
> clearing up a couple of things would be helpful.
>
> For starters, I agree it would be good to have some more information,
> besides just "@Deprecated" annotations. Is it possible to extend
> annotations with information like:
> - from which version was it deprecated
> - when is it planned to be removed (we could always mark `2.0` as "never"
> ;) )
> - add maybe some pre/post release step of verifying that removal has
> actually happened?
>
> ?
>
> On the other hand, I think it's very important to maintain backward
> compatibility with Flink as much as possible. As a developer I don't
> like dealing with this, but as a user I hate dealing with incompatible
> upgrades even more. So all in all, I would be in favour of putting more
> effort not in deprecating and removing APIs, but making sure that they are
> stable.
>
> Stephan Ewen also raised a point some time ago that, in the recent past, we
> developed a habit of marking everything as `@Experimental` or
> `@PublicEvolving` and leaving it as that forever. Maybe we should also
> include deadlines (2 releases since introduction?) for changing
> `@Experimental`/`@PublicEvolving` into `@Public` in this kind of
> guidelines/automated checks?
>
> Piotrek
>
> pt., 15 sty 2021 o 13:56 Timo Walther  napisał(a):
>
> > Hi everyone,
> >
> > I would like to start a discussion how we treat deprecated and legacy
> > code in Flink in the future. During the last years, our code base has
> > grown quite a bit and a couple of interfaces and components have been
> > reworked on the way.
> >
> > I'm sure each component has a few legacy parts that are waiting for
> > removal. Apart from keeping outdated API around for a couple of releases
> > until users have updated their code, it is also often easier to just put
> > a @Deprecation annotation and postpone the actual change.
> >
> > When looking at the current code, we have duplicated SQL planners,
> > duplicated APIs (DataSet/DataStream), duplicated source/sink interfaces,
> > outdated connectors (Elasticsearch 5?) and dependencies (Scala 2.11?).
> >
> > I'm wondering whether we should come up with some legacy/deprecation
> > guidelines for the future.
> >
> > Some examples:
> >
> > - I could imagine new Flink-specific annotations for documenting (in
> > code) in which version an interface was deprecated and when the planned
> > removal 

Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Till Rohrmann
Hi everyone,

Thanks for starting this discussion Ingo. I think being able to use env
variables to change Flink's configuration will be a very useful feature.

Concerning the two approaches I would be in favour of the second approach
($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
prepare a special flink-conf.yaml where he inserts env variables for every
config value he wants to configure. Since this is not required with the
second approach, I think it is more general and easier to use. Also, the
user does not have to remember a second set of names (env names) which he
has to/can set.

For how to substitute the values, I think it should happen when we load the
Flink configuration. First we read the file and then overwrite values
specified via an env variable or dynamic properties in some defined order.
For env.java.opts and other options which are used for starting the JVM we
might need special handling in the bash scripts.

Cheers,
Till
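Till's second approach implies a mechanical mapping from environment variable names to configuration keys. A sketch of such a mapping follows; the FLINK_CONFIG_ prefix and the underscore-to-dot rule are assumptions for illustration. Note that the mapping is lossy for keys containing dashes or mixed case, which matches Steven Wu's observation in this thread that shell variable naming (caps and underscores only) cannot express Java-style dotted keys directly.

```java
import java.util.Locale;

public class EnvKeyMapper {

    // Assumed prefix for the "$FLINK_CONFIG_..." approach; the real prefix would be part of the design.
    static final String PREFIX = "FLINK_CONFIG_";

    /** Maps e.g. FLINK_CONFIG_STATE_CHECKPOINTS_DIR to state.checkpoints.dir; returns null if not matching. */
    static String toConfigKey(String envName) {
        if (envName == null || !envName.startsWith(PREFIX)) {
            return null;
        }
        return envName.substring(PREFIX.length())
                .toLowerCase(Locale.ROOT)
                .replace('_', '.');
    }

    public static void main(String[] args) {
        System.out.println(toConfigKey("FLINK_CONFIG_STATE_CHECKPOINTS_DIR"));
    }
}
```

A key such as "high-availability.storageDir" cannot be recovered by this rule, so a real implementation would need either an explicit lookup table or an escaping convention for dashes and uppercase letters.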

On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk  wrote:

> Hi Yang,
>
> 1. As you said I think this doesn't affect Ververica Platform, really, so
> I'm more than happy to hear and follow the thoughts of people more
> experienced with Flink than me.
> 2. I wasn't aware of env.java.opts, but that's definitely a candidate where
> a user may want to "escape" it so it doesn't get substituted immediately, I
> agree.
>
>
> Regards
> Ingo
>
> On Tue, Jan 19, 2021 at 4:47 AM Yang Wang  wrote:
>
> > Hi Ingo,
> >
> > Thanks for your response.
> >
> > 1. Not distinguishing JM/TM is reasonable, but what about the client
> > side? For Yarn/K8s deployments, the local flink-conf.yaml will be
> > shipped to the JM/TM. So I am just confused about where the environment
> > variables should be replaced. IIUC, it is not an issue for Ververica
> > Platform since substitution is always done on the JM/TM side.
> >
> > 2. I believe we should support not doing the substitution for specific
> > keys. A typical use case is "env.java.opts". If the
> > value contains environment variables, they are expected to be replaced
> > exactly when the java command is executed,
> > not after the java process is started. Maybe escaping with single quote
> is
> > enough.
> >
> > 3. Applying the substitution only to the value makes sense to me.
> >
> >
> > Best,
> > Yang
> >
> > Steven Wu  于2021年1月19日周二 上午12:36写道:
> >
> > > Variable substitution (proposed here) is definitely useful.
> > >
> > > For us, hierarchical override is more useful.  E.g., we may have the
> > > default value of "state.checkpoints.dir=path1" defined in
> > flink-conf.yaml.
> > > But maybe we want to override it to "state.checkpoints.dir=path2" via
> > > environment variable in some scenarios. Otherwise, we have to define a
> > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> > > config, which is annoying.
> > >
> > > As Ingo pointed, it is also annoying to handle Java property key naming
> > > convention (dots separated), as dots aren't allowed in shell env var
> > naming
> > > (All caps, separated with underscore). Shell will complain. We have to
> > > bundle all env var overrides (k-v pairs) in a single property value
> (JSON
> > > and base64 encode) to avoid it.
> > >
> > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk  wrote:
> > >
> > > > Hi Yang,
> > > >
> > > > thanks for your questions! I'm glad to see this feature is being
> > received
> > > > positively.
> > > >
> > > > ad 1) We don't distinguish JM/TM, and I can't think of a good reason
> > why
> > > a
> > > > user would want to do so. I'm not very experienced with Flink,
> however,
> > > so
> > > > please excuse me if I'm overlooking some obvious reason here. :-)
> > > > ad 2) Admittedly I don't have a good overview on all the
> configuration
> > > > options that exist, but from those that I do know I can't imagine
> > someone
> > > > wanting to pass a value like "${MY_VAR}" verbatim. In Ververica
> > Platform
> > > as
> > > > of now we ignore this problem. If, however, this needs to be
> > addressed, a
> > > > possible solution could be to allow escaping syntax such as
> > "\${MY_VAR}".
> > > >
> > > > Another point to consider here is when exactly the substitution takes
> > > > place: on the "raw" file, or on the parsed key / value separately,
> and
> > if
> > > > so, should it support both key and value? My current thinking is that
> > > > substituting only the value of the parsed entry should be sufficient.
> > > >
> > > >
> > > > Regards
> > > > Ingo
> > > >
> > > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang 
> > wrote:
> > > >
> > > > > Thanks for kicking off the discussion.
> > > > >
> > > > > I think supporting environment variables rendering in the Flink
> > > > > configuration yaml file is a good idea. Especially for
> > > > > the Kubernetes environment since we are using the secret resource
> to
> > > > store
> > > > > the authentication information.
> > > > >
> > > > > But I have some questions for how to do it?
> > > > > 1. The 

[jira] [Created] (FLINK-21026) Align column list specification with Hive in INSERT statement

2021-01-19 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-21026:


 Summary: Align column list specification with Hive in INSERT 
statement
 Key: FLINK-21026
 URL: https://issues.apache.org/jira/browse/FLINK-21026
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Zhenghua Gao


[HIVE-9481|https://issues.apache.org/jira/browse/HIVE-9481] allows column list 
specification in INSERT statement. The syntax is:
{code:java}
INSERT INTO TABLE table_name 
[PARTITION (partcol1[=val1], partcol2[=val2] ...)] 
[(column list)] 
select_statement FROM from_statement
{code}
Meanwhile, Flink introduced a PARTITION syntax in which the PARTITION clause 
appears after the COLUMN LIST clause. It looks weird, and luckily we don't 
support the COLUMN LIST clause yet 
([FLINK-18726|https://issues.apache.org/jira/browse/FLINK-18726]). I think 
it's a good chance to align this with Hive now.

 
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Till Rohrmann
Thanks a lot for driving this release Xintong. This was indeed a release
with some obstacles to overcome and you did it very well!

Cheers,
Till

On Tue, Jan 19, 2021 at 5:59 AM Xingbo Huang  wrote:

> Thanks Xintong for the great work!
>
> Best,
> Xingbo
>
> Peter Huang  于2021年1月19日周二 下午12:51写道:
>
> > Thanks for the great effort to make this happen. It paves the way for
> > us to use 1.12 soon.
> >
> > Best Regards
> > Peter Huang
> >
> > On Mon, Jan 18, 2021 at 8:16 PM Yang Wang  wrote:
> >
> > > Thanks Xintong for the great work as our release manager!
> > >
> > >
> > > Best,
> > > Yang
> > >
> > > Xintong Song  于2021年1月19日周二 上午11:53写道:
> > >
> > >> The Apache Flink community is very happy to announce the release of
> > >> Apache Flink 1.12.1, which is the first bugfix release for the Apache
> > Flink
> > >> 1.12 series.
> > >>
> > >> Apache Flink® is an open-source stream processing framework for
> > >> distributed, high-performing, always-available, and accurate data
> > streaming
> > >> applications.
> > >>
> > >> The release is available for download at:
> > >> https://flink.apache.org/downloads.html
> > >>
> > >> Please check out the release blog post for an overview of the
> > >> improvements for this bugfix release:
> > >> https://flink.apache.org/news/2021/01/19/release-1.12.1.html
> > >>
> > >> The full release notes are available in Jira:
> > >>
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
> > >>
> > >> We would like to thank all contributors of the Apache Flink community
> > who
> > >> made this release possible!
> > >>
> > >> Regards,
> > >> Xintong
> > >>
> > >
> >
>


Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Ingo Bürk
Hi Yang,

1. As you said I think this doesn't affect Ververica Platform, really, so
I'm more than happy to hear and follow the thoughts of people more
experienced with Flink than me.
2. I wasn't aware of env.java.opts, but that's definitely a candidate where
a user may want to "escape" it so it doesn't get substituted immediately, I
agree.


Regards
Ingo

On Tue, Jan 19, 2021 at 4:47 AM Yang Wang  wrote:

> Hi Ingo,
>
> Thanks for your response.
>
> 1. Not distinguishing JM/TM is reasonable, but what about the client
> side? For Yarn/K8s deployments, the local flink-conf.yaml will be
> shipped to the JM/TM. So I am just confused about where the environment
> variables should be replaced. IIUC, it is not an issue for Ververica
> Platform since substitution is always done on the JM/TM side.
>
> 2. I believe we should support not doing the substitution for specific keys. A
> typical use case is "env.java.opts". If the
> value contains environment variables, they are expected to be replaced
> exactly when the java command is executed,
> not after the java process is started. Maybe escaping with single quote is
> enough.
>
> 3. Applying the substitution only to the value makes sense to me.
>
>
> Best,
> Yang
>
> Steven Wu  于2021年1月19日周二 上午12:36写道:
>
> > Variable substitution (proposed here) is definitely useful.
> >
> > For us, hierarchical override is more useful.  E.g., we may have the
> > default value of "state.checkpoints.dir=path1" defined in
> flink-conf.yaml.
> > But maybe we want to override it to "state.checkpoints.dir=path2" via
> > environment variable in some scenarios. Otherwise, we have to define a
> > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> > config, which is annoying.
> >
> > As Ingo pointed, it is also annoying to handle Java property key naming
> > convention (dots separated), as dots aren't allowed in shell env var
> naming
> > (All caps, separated with underscore). Shell will complain. We have to
> > bundle all env var overrides (k-v pairs) in a single property value (JSON
> > and base64 encode) to avoid it.
> >
> > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk  wrote:
> >
> > > Hi Yang,
> > >
> > > thanks for your questions! I'm glad to see this feature is being
> received
> > > positively.
> > >
> > > ad 1) We don't distinguish JM/TM, and I can't think of a good reason
> why
> > a
> > > user would want to do so. I'm not very experienced with Flink, however,
> > so
> > > please excuse me if I'm overlooking some obvious reason here. :-)
> > > ad 2) Admittedly I don't have a good overview on all the configuration
> > > options that exist, but from those that I do know I can't imagine
> someone
> > > wanting to pass a value like "${MY_VAR}" verbatim. In Ververica
> Platform
> > as
> > > of now we ignore this problem. If, however, this needs to be
> addressed, a
> > > possible solution could be to allow escaping syntax such as
> "\${MY_VAR}".
> > >
> > > Another point to consider here is when exactly the substitution takes
> > > place: on the "raw" file, or on the parsed key / value separately, and
> if
> > > so, should it support both key and value? My current thinking is that
> > > substituting only the value of the parsed entry should be sufficient.
> > >
> > >
> > > Regards
> > > Ingo
> > >
> > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang 
> wrote:
> > >
> > > > Thanks for kicking off the discussion.
> > > >
> > > > I think supporting environment variables rendering in the Flink
> > > > configuration yaml file is a good idea. Especially for
> > > > the Kubernetes environment since we are using the secret resource to
> > > store
> > > > the authentication information.
> > > >
> > > > But I have some questions for how to do it?
> > > > 1. The environments in Flink configuration yaml will be replaced in
> > > client,
> > > > JobManager, TaskManager or all of them?
> > > > 2. If users do not want some config options to be replaced, how to
> > > > achieve that?
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Khachatryan Roman  于2021年1月18日周一
> > 下午8:55写道:
> > > >
> > > > > Hi Ingo,
> > > > >
> > > > > Thanks a lot for this proposal!
> > > > >
> > > > > We had a related discussion recently in the context of FLINK-19520
> > > > > (randomizing tests configuration) [1].
> > > > > I believe other scenarios will benefit as well.
> > > > >
> > > > > For the end users, I think substitution in configuration files is
> > > > > preferable over parsing env vars in Flink code.
> > > > > And for cases without such a file, we could have a default one on
> the
> > > > > classpath with all substitutions defined (and then merge everything
> > > from
> > > > > the user-supplied file).
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-19520
> > > > >
> > > > > Regards,
> > > > > Roman
> > > > >
> > > > >
> > > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk 
> > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > in Ververica Platform we offer a feature to use 

[jira] [Created] (FLINK-21025) SQLClientHBaseITCase fails when untarring HBase

2021-01-19 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-21025:


 Summary: SQLClientHBaseITCase fails when untarring HBase
 Key: FLINK-21025
 URL: https://issues.apache.org/jira/browse/FLINK-21025
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase, Table SQL / Client, Tests
Affects Versions: 1.13.0
Reporter: Dawid Wysakowicz
 Fix For: 1.13.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12210=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529

{code}
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 908.614 
s <<< FAILURE! - in org.apache.flink.tests.util.hbase.SQLClientHBaseITCase
Jan 19 08:19:36 [ERROR] testHBase[1: 
hbase-version:2.2.3](org.apache.flink.tests.util.hbase.SQLClientHBaseITCase)  
Time elapsed: 615.099 s  <<< ERROR!
Jan 19 08:19:36 java.io.IOException: 
Jan 19 08:19:36 Process execution failed due error. Error output:
Jan 19 08:19:36 at 
org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:133)
Jan 19 08:19:36 at 
org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:108)
Jan 19 08:19:36 at 
org.apache.flink.tests.util.AutoClosableProcess.runBlocking(AutoClosableProcess.java:70)
Jan 19 08:19:36 at 
org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.setupHBaseDist(LocalStandaloneHBaseResource.java:86)
Jan 19 08:19:36 at 
org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.before(LocalStandaloneHBaseResource.java:76)
Jan 19 08:19:36 at 
org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:46)
Jan 19 08:19:36 at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
Jan 19 08:19:36 at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
Jan 19 08:19:36 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
Jan 19 08:19:36 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
Jan 19 08:19:36 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Jan 19 08:19:36 at org.junit.runners.Suite.runChild(Suite.java:128)
Jan 19 08:19:36 at org.junit.runners.Suite.runChild(Suite.java:27)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Jan 19 08:19:36 at 
org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:48)
Jan 19 08:19:36 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Jan 19 08:19:36 at org.junit.runners.Suite.runChild(Suite.java:128)
Jan 19 08:19:36 at org.junit.runners.Suite.runChild(Suite.java:27)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Jan 19 08:19:36 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Jan 19 08:19:36 at 
org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
Jan 19 08:19:36 at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
Jan 19 08:19:36 at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
Jan 19 08:19:36 at 

[jira] [Created] (FLINK-21024) Dynamic properties get exposed to job's main method if user parameters are passed

2021-01-19 Thread Matthias (Jira)
Matthias created FLINK-21024:


 Summary: Dynamic properties get exposed to job's main method if 
user parameters are passed
 Key: FLINK-21024
 URL: https://issues.apache.org/jira/browse/FLINK-21024
 Project: Flink
  Issue Type: Bug
Reporter: Matthias


A bug was identified in the [user 
ML|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Application-cluster-standalone-job-some-JVM-Options-added-to-Program-Arguments-td40719.html]
 by Alexey exposing dynamic properties into the job user code.

I was able to reproduce this issue by slightly adapting the WordCount example 
(see attached WordCount2.jar).

Initiating a standalone job without using the `--input` parameter will result 
in printing an empty array:
```
./bin/standalone-job.sh start --job-classname 
org.apache.flink.streaming.examples.wordcount.WordCount2
```
The corresponding `*.out` file looks like this:
```
[]
Executing WordCount2 example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
```

In contrast, initiating the standalone job using the `--input` parameter will 
expose the dynamic properties:
```
./bin/standalone-job.sh start --job-classname 
org.apache.flink.streaming.examples.wordcount.WordCount2 --input 
/opt/flink/config/flink-conf.yaml
```
Resulting in the following output:
```
[--input, /opt/flink/config/flink-conf.yaml, -D, 
jobmanager.memory.off-heap.size=134217728b, -D, 
jobmanager.memory.jvm-overhead.min=201326592b, -D, 
jobmanager.memory.jvm-metaspace.size=268435456b, -D, 
jobmanager.memory.heap.size=1073741824b, -D, 
jobmanager.memory.jvm-overhead.max=201326592b]
Printing result to stdout. Use --output to specify output path.
```

Interestingly, this cannot be reproduced on a local standalone session cluster.
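The leak above means "-D key=value" pairs appear in the args array handed to the user's main method. As an illustration of the shape of the problem (not the actual Flink fix), a sketch that strips such pairs from an argument array:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArgsFilterDemo {

    /** Drops "-D" tokens and their following "key=value" token, keeping all other arguments. */
    static String[] stripDynamicProperties(String[] args) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                i++; // skip the key=value token that belongs to this -D
            } else {
                kept.add(args[i]);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] leaked = {"--input", "/opt/flink/conf/flink-conf.yaml",
                "-D", "jobmanager.memory.heap.size=1073741824b"};
        System.out.println(Arrays.toString(stripDynamicProperties(leaked)));
        // [--input, /opt/flink/conf/flink-conf.yaml]
    }
}
```

A proper fix would presumably keep dynamic properties out of the program arguments before they reach the entry point, rather than filtering them afterwards as sketched here.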



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Flink configuration from environment variables

2021-01-19 Thread Ingo Bürk
Hi Steven,

regarding the hierarchical override, we could even expand the substitution
solution to support shell syntax with default values like

state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}

such that if the environment variable doesn't exist, path1 will be used.
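
The substitution described here could be sketched roughly as follows. This is a Python illustration only (Flink itself is Java), and the exact escaping and fallback policy is an assumption, not an existing implementation:

```python
import os
import re

# Matches ${VAR} and ${VAR:-default} (shell-style default syntax).
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def substitute_env(value, env=os.environ):
    """Replace ${VAR} and ${VAR:-default} occurrences in a config value."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        # One possible policy: fall back to the default if given,
        # otherwise leave the placeholder verbatim.
        return default if default is not None else match.group(0)
    return _PATTERN.sub(repl, value)

print(substitute_env("${CHECKPOINTS_DIR:-path1}", env={}))
# -> path1
print(substitute_env("${CHECKPOINTS_DIR:-path1}",
                     env={"CHECKPOINTS_DIR": "s3://bucket/cp"}))
# -> s3://bucket/cp
```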


Regards
Ingo


On Mon, Jan 18, 2021 at 5:36 PM Steven Wu  wrote:

> Variable substitution (proposed here) is definitely useful.
>
> For us, hierarchical override is more useful.  E.g., we may have the
> default value of "state.checkpoints.dir=path1" defined in flink-conf.yaml.
> But maybe we want to override it to "state.checkpoints.dir=path2" via
> environment variable in some scenarios. Otherwise, we have to define a
> corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> config, which is annoying.
>
> As Ingo pointed out, it is also annoying to handle the Java property key
> naming convention (dot-separated), as dots aren't allowed in shell env var
> names (conventionally all caps, separated by underscores); the shell will
> complain. We had to bundle all env var overrides (k-v pairs) into a single
> property value (JSON, base64-encoded) to avoid this.
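
The bundling workaround Steven describes (JSON plus base64 in one variable) can be sketched as follows. Python for illustration only, and the env var name FLINK_CONFIG_OVERRIDES is made up for the example:

```python
import base64
import json

def encode_overrides(overrides):
    """Pack config overrides into a single shell-safe string."""
    return base64.b64encode(json.dumps(overrides).encode("utf-8")).decode("ascii")

def decode_overrides(blob):
    """Recover the override dict on the consuming side."""
    return json.loads(base64.b64decode(blob))

overrides = {"state.checkpoints.dir": "path2", "s3.access-key": "AKIA..."}
blob = encode_overrides(overrides)  # e.g. export FLINK_CONFIG_OVERRIDES=$blob
print(decode_overrides(blob)["state.checkpoints.dir"])
# -> path2
```

The dots in the config keys never reach the shell, which is the point of the workaround.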
>
> On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk  wrote:
>
> > Hi Yang,
> >
> > thanks for your questions! I'm glad to see this feature is being received
> > positively.
> >
> > ad 1) We don't distinguish JM/TM, and I can't think of a good reason why
> a
> > user would want to do so. I'm not very experienced with Flink, however,
> so
> > please excuse me if I'm overlooking some obvious reason here. :-)
> > ad 2) Admittedly I don't have a good overview on all the configuration
> > options that exist, but from those that I do know I can't imagine someone
> > wanting to pass a value like "${MY_VAR}" verbatim. In Ververica Platform
> as
> > of now we ignore this problem. If, however, this needs to be addressed, a
> > possible solution could be to allow escaping syntax such as "\${MY_VAR}".
> >
> > Another point to consider here is when exactly the substitution takes
> > place: on the "raw" file, or on the parsed key / value separately, and if
> > so, should it support both key and value? My current thinking is that
> > substituting only the value of the parsed entry should be sufficient.
> >
> >
> > Regards
> > Ingo
> >
> > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang  wrote:
> >
> > > Thanks for kicking off the discussion.
> > >
> > > I think supporting environment variables rendering in the Flink
> > > configuration yaml file is a good idea. Especially for
> > > the Kubernetes environment since we are using the secret resource to
> > store
> > > the authentication information.
> > >
> > > But I have some questions about how to do it:
> > > 1. Will the environment variables in the Flink configuration yaml be
> > > replaced on the client, the JobManager, the TaskManager, or all of them?
> > > 2. If users do not want some config options to be replaced, how can they
> > > achieve that?
> > >
> > > Best,
> > > Yang
> > >
> > > Khachatryan Roman  于2021年1月18日周一
> 下午8:55写道:
> > >
> > > > Hi Ingo,
> > > >
> > > > Thanks a lot for this proposal!
> > > >
> > > > We had a related discussion recently in the context of FLINK-19520
> > > > (randomizing tests configuration) [1].
> > > > I believe other scenarios will benefit as well.
> > > >
> > > > For the end users, I think substitution in configuration files is
> > > > preferable over parsing env vars in Flink code.
> > > > And for cases without such a file, we could have a default one on the
> > > > classpath with all substitutions defined (and then merge everything
> > from
> > > > the user-supplied file).
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-19520
> > > >
> > > > Regards,
> > > > Roman
> > > >
> > > >
> > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk 
> wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > in Ververica Platform we offer a feature to use environment
> variables
> > > in
> > > > > the Flink configuration¹, e.g.
> > > > >
> > > > > ```
> > > > > s3.access-key: ${S3_ACCESS_KEY}
> > > > > ```
> > > > >
> > > > > We've been discussing internally whether contributing such a
> feature
> > to
> > > > > Flink directly would make sense and wanted to start a discussion on
> > > this
> > > > > topic.
> > > > >
> > > > > An alternative to the above would be to parse environment variables
> > > > > directly based on their name: instead of defining the substitution in
> > > > > the Flink configuration as above, a config option would be set
> > > > > automatically if something like $FLINK_CONFIG_S3_ACCESS_KEY was set
> > > > > in the environment. This is somewhat similar to what e.g. Spring
> > > > > does, and faces similar challenges (dealing with "."s etc.).
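
The naming challenge mentioned here can be made concrete: a naive mapping from env var names to config keys is lossy, because both '.' and '-' collapse to '_' in shell-safe names. A sketch (Python for illustration; the prefix and mapping rules are assumptions, not anything Flink implements):

```python
def env_to_config_key(env_name, prefix="FLINK_CONFIG_"):
    """Naively map a shell-safe env var name back to a Flink config key.

    Lossy by construction: FLINK_CONFIG_S3_ACCESS_KEY yields
    's3.access.key', but the real Flink key is 's3.access-key'.
    Some extra disambiguation rule would be needed.
    """
    if not env_name.startswith(prefix):
        return None
    return env_name[len(prefix):].lower().replace("_", ".")

print(env_to_config_key("FLINK_CONFIG_S3_ACCESS_KEY"))
# -> s3.access.key  (wrong key: should be s3.access-key)
```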
> > > > >
> > > > > Although I view both of these approaches as mostly orthogonal,
> > > supporting
> > > > > both very likely wouldn't make sense, of course. So I was wondering
> > > what
> > > > > your opinion is in terms of whether the project would benefit from
> > > > > environment variable 

[jira] [Created] (FLINK-21023) Task Manager uses the container dir of Job Manager when running flink job on yarn-cluster.

2021-01-19 Thread Tang Yan (Jira)
Tang Yan created FLINK-21023:


 Summary: Task Manager uses the container dir of Job Manager when 
running flink job on yarn-cluster.
 Key: FLINK-21023
 URL: https://issues.apache.org/jira/browse/FLINK-21023
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission
Affects Versions: 1.11.1, 1.12.0
Reporter: Tang Yan


I tried to use the -yt (yarnship) option to distribute my config files to the 
YARN cluster and read one of the files in code, using the Flink WordCount 
example.

Here is my submit command:

/opt/Flink/bin/flink run -m yarn-cluster -p 1 -yt /path/to/conf -c 
org.apache.flink.examples.java.wordcount.WordCount 
/opt/Flink/examples/batch/WordCount.jar --input conf/cmp_online.cfg

Test Result:

I found that if the job manager and task manager are launched on the same 
node, the job runs successfully. But when they run on different nodes, the 
job fails with the errors below. The conf folder is distributed to the 
container cache dirs, such as 
file:/data/d1/yarn/nm/usercache/yanta/appcache/application_1609125504851_3620/container_e283_1609125504851_3620_01_01/conf
 on the job manager node and 
file:/data/d1/yarn/nm/usercache/yanta/appcache/application_1609125504851_3620/container_e283_1609125504851_3620_01_02/conf
 on the task manager node. But why does the task manager load the conf file 
from the container_eXXX_01 path (which is located on the job manager node)?

2021-01-19 04:19:11,405 INFO  org.apache.flink.yarn.YarnResourceManager [] - Registering TaskManager with ResourceID container_e283_1609125504851_3620_01_02 (akka.tcp://fl...@rphf1hsn026.qa.webex.com:46785/user/rpc/taskmanager_0) at ResourceManager
2021-01-19 04:19:11,506 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - CHAIN DataSource (at main(WordCount.java:69) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:84)) -> Combine (SUM(1), at main(WordCount.java:87) (1/1) (10cf614584617de770c6fd0f0aad4db7) switched from SCHEDULED to DEPLOYING.
2021-01-19 04:19:11,507 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying CHAIN DataSource (at main(WordCount.java:69) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:84)) -> Combine (SUM(1), at main(WordCount.java:87) (1/1) (attempt #0) to container_e283_1609125504851_3620_01_02 @ rphf1hsn026.qa.webex.com (dataPort=46647)
2021-01-19 04:19:11,608 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - CHAIN DataSource (at main(WordCount.java:69) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:84)) -> Combine (SUM(1), at main(WordCount.java:87) (1/1) (10cf614584617de770c6fd0f0aad4db7) switched from DEPLOYING to RUNNING.
2021-01-19 04:19:11,792 INFO  org.apache.flink.api.common.io.LocatableInputSplitAssigner [] - Assigning remote split to host rphf1hsn026
2021-01-19 04:19:11,847 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - CHAIN DataSource (at main(WordCount.java:69) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:84)) -> Combine (SUM(1), at main(WordCount.java:87) (1/1) (10cf614584617de770c6fd0f0aad4db7) switched from RUNNING to FAILED on org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@3e19cc76.
java.io.IOException: Error opening the Input Split file:/data/d1/yarn/nm/usercache/yanta/appcache/application_1609125504851_3620/container_e283_1609125504851_3620_01_01/conf/cmp_online.cfg [0,71]: /data/d1/yarn/nm/usercache/yanta/appcache/application_1609125504851_3620/container_e283_1609125504851_3620_01_01/conf/cmp_online.cfg (No such file or directory)
	at org.apache.flink.api.common.io.FileInputFormat.open(FileInputFormat.java:824) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:470) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:47) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_272]
Caused by: java.io.FileNotFoundException: /data/d1/yarn/nm/usercache/yanta/appcache/application_1609125504851_3620/container_e283_1609125504851_3620_01_01/conf/cmp_online.cfg (No such file or directory)
	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_272]
	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_272]
	at 

[DISCUSS] Correct time-related function behavior in Flink SQL

2021-01-19 Thread Leonard Xu
Hi, all

I want to start a discussion about correcting time-related function behavior 
in Flink SQL. This is a tricky topic, but I think it's time to address it.

Currently some temporal function behaviors are confusing to users:
1. When users use PROCTIME() in SQL, its value is offset from the wall-clock 
time in the user's local time zone, so users need to add their local time zone 
offset manually to get the expected local timestamp (e.g. users in Germany 
need to add +1h).

2. Users cannot use CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP to get the 
wall-clock timestamp in the local time zone, and thus they need to write a UDF 
just to implement a simple filter like WHERE date_col = CURRENT_DATE.

3. Another common case is a time window with a day interval based on 
PROCTIME(): users plan to put all data from one day into the same window, but 
the window is assigned using the timestamp in the UTC+0 time zone rather than 
the session time zone, so the window starts with an offset (e.g. users in 
China need to add -8h at the start of their business SQL and then +8h when 
outputting the result; the conversion feels like magic to users).

These problems stem from the fact that many time-related functions, like 
PROCTIME(), NOW(), CURRENT_DATE, CURRENT_TIME and CURRENT_TIMESTAMP, return 
time values based on the UTC+0 time zone.
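
The offset complaint can be illustrated by rendering one instant both in UTC and in a UTC+8 session time zone (a Python sketch for illustration only; Flink's actual implementation is in Java):

```python
from datetime import datetime, timezone, timedelta

# One wall-clock instant, rendered two ways.
instant = datetime(2020, 12, 28, 23, 52, 52, tzinfo=timezone.utc)

# What a UTC-based function effectively returns today:
utc_rendering = instant.replace(tzinfo=None)

# What a user in a UTC+8 session (e.g. China) expects:
session_tz = timezone(timedelta(hours=8))
local_rendering = instant.astimezone(session_tz).replace(tzinfo=None)

print(utc_rendering)                    # -> 2020-12-28 23:52:52
print(local_rendering)                  # -> 2020-12-29 07:52:52
print(local_rendering - utc_rendering)  # -> 8:00:00
```

Note the rendered values even fall on different calendar days, which is exactly why day windows and CURRENT_DATE filters misbehave.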

This topic leads to a comparison of the three types, i.e. TIMESTAMP/TIMESTAMP 
WITHOUT TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE and TIMESTAMP WITH TIME 
ZONE. To help understand the three types better, I wrote a document [1]. You 
can also learn about the three timestamp types' behavior in the Hadoop 
ecosystem from the reference links in the doc.


I investigated the current behavior of all Flink time-related functions and 
compared it with other DB vendors like Pg, Presto, Hive, Spark and Snowflake. 
I made a spreadsheet [2] to organize them; we can use it for the next 
discussion. Please let me know if I missed something.

From my investigation, I think we need to correct the behavior of 
NOW()/PROCTIME()/CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP. To correct 
them, we can change the function return type, the function return value, or 
both. All of those ways are valid, because SQL:2011 does not specify the 
function return type and every SQL engine vendor has its own implementation. 
Take the CURRENT_TIMESTAMP function as an example:

Function: CURRENT_TIMESTAMP

Flink's current behavior (return type TIMESTAMP(0) NOT NULL):
  # session timezone: UTC    ->  2020-12-28T23:52:52
  # session timezone: UTC+8  ->  2020-12-28T23:52:52
  (wall clock in UTC+8: 2020-12-29 07:52:52)

Existing problem: wrong value. It returns the UTC timestamp, but the user 
expects the current timestamp in the session time zone.

Other vendors' behavior:
- MySQL, Spark: NOW() and CURRENT_TIMESTAMP return the current timestamp in 
the session time zone; the return type is TIMESTAMP.
- Pg, Presto: NOW() and LOCALTIMESTAMP return the current timestamp in the 
session time zone; the return type is TIMESTAMP WITH TIME ZONE.
- Snowflake: CURRENT_TIMESTAMP / LOCALTIMESTAMP return the current timestamp 
in the session time zone; the return type is TIMESTAMP WITH LOCAL TIME ZONE.

Proposed change: Flink should return the current timestamp in the session 
time zone; the return type should be TIMESTAMP.


I tend to change only the return value of these problematic functions and 
introduce an option for compatibility considerations. What do you think?


Looking forward to your feedback.

Best,
Leonard

[1] https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing
[2] https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Xintong Song
Thanks for the feedback, Stephan.

Actually, your proposal has also come to my mind at some point. And I have
some concerns about it.


1. It does not give users the same control as the SSG-based approach.


While both approaches do not require specifying for each operator,
SSG-based approach supports the semantic that "some operators together use
this much resource" while the operator-based approach doesn't.


Think of a long pipeline with m operators (o_1, o_2, ..., o_m), where at some
point there's an agg o_n (1 < n < m) which significantly reduces the data
volume. One can separate the pipeline into two groups SSG_1 (o_1, ..., o_n)
and SSG_2 (o_n+1, ..., o_m), so that configuring much higher parallelisms
for operators in SSG_1 than for operators in SSG_2 won't lead to too much
wasted resource. If the two SSGs end up needing different resources, with
the SSG-based approach one can directly specify resources for the two
groups. However, with the operator-based approach, the user would have to
specify resources for each operator in one of the two groups, and tune the
default slot resource via configurations to fit the other group.


2. It increases the chance of breaking operator chains.


Setting chainable operators into different slot sharing groups will prevent
them from being chained. In the current implementation, downstream operators,
if their SSG is not explicitly specified, will be set to the same group as
the chainable upstream operators (unless multiple upstream operators are in
different groups), to reduce the chance of breaking chains.


Thinking of chainable operators o_1 -> o_2 -> o_3 -> o_4: deciding SSGs
based on whether a resource is specified, we can easily get groups like (o_1,
o_3) & (o_2, o_4), where none of the operators can be chained. This is also
possible for the SSG-based approach, but I believe the chance is much
smaller because there's no strong reason for users to specify the groups
with alternate operators like that. We are more likely to get groups like
(o_1, o_2) & (o_3, o_4), where the chain breaks only between o_2 and o_3.
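
The chain-breaking argument can be made concrete with a toy count of SSG boundaries along the pipeline (a Python sketch; chaining is reduced here to "adjacent operators chain only if they share a group", which ignores Flink's other chaining conditions):

```python
def chain_breaks(pipeline, group_of):
    """Count chain breaks caused by SSG boundaries in a chainable pipeline."""
    return sum(1 for a, b in zip(pipeline, pipeline[1:])
               if group_of[a] != group_of[b])

pipeline = ["o1", "o2", "o3", "o4"]

# Groups that can fall out of "has a resource spec" on alternate operators:
alternating = {"o1": "g1", "o2": "g2", "o3": "g1", "o4": "g2"}
# Groups a user would more plausibly declare explicitly:
contiguous = {"o1": "g1", "o2": "g1", "o3": "g2", "o4": "g2"}

print(chain_breaks(pipeline, alternating))  # -> 3 (nothing chains)
print(chain_breaks(pipeline, contiguous))   # -> 1 (only between o2 and o3)
```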


3. It complicates the system by having two different mechanisms for sharing
managed memory in a slot.


- In FLIP-141, we introduced the intra-slot managed memory sharing
mechanism, where managed memory is first distributed according to the
consumer type, then further distributed across operators of that consumer
type.

- With the operator-based approach, managed memory size specified for an
operator should account for all the consumer types of that operator. That
means the managed memory is first distributed across operators, then
distributed to different consumer types of each operator.


Unfortunately, the different order of the two calculation steps can lead to
different results. To be specific, the semantic of the configuration option
`consumer-weights` changed (within a slot vs. within an operator).
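
The order-of-distribution difference can be shown with toy numbers (a Python sketch; the weights, operator set and even per-operator split are made-up assumptions, not Flink's actual FLIP-141 arithmetic):

```python
def slot_first(total, weights, users):
    """FLIP-141 order: slot -> consumer type (by weight) -> operators of that type."""
    per_op = {op: 0.0 for ops in users.values() for op in ops}
    wsum = sum(weights.values())
    for ctype, ops in users.items():
        share = total * weights[ctype] / wsum
        for op in ops:
            per_op[op] += share / len(ops)
    return per_op

def operator_first(total, weights, op_types):
    """Operator-based order: slot -> operators (evenly here) -> consumer types."""
    per_type = {t: 0.0 for t in weights}
    per_op_budget = total / len(op_types)
    for op, types in op_types.items():
        wsum = sum(weights[t] for t in types)
        for t in types:
            per_type[t] += per_op_budget * weights[t] / wsum
    return per_type

weights = {"DATAPROC": 70, "PYTHON": 30}          # consumer-weights config
users = {"DATAPROC": ["op1", "op2"], "PYTHON": ["op2"]}
op_types = {"op1": ["DATAPROC"], "op2": ["DATAPROC", "PYTHON"]}

print(slot_first(100, weights, users))       # op1: 35.0, op2: 65.0
print(operator_first(100, weights, op_types))# DATAPROC: 85.0, PYTHON: 15.0
```

With the operator-first order, the per-type totals come out 85/15 instead of the configured 70/30 split, which is the semantic change described above.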



To sum things up: while (3) might be a bit more implementation-related, I
think (1) and (2) suggest that the price the proposed approach pays for
avoiding a resource specification on every operator is that it is not as
independent from operator chaining and slot sharing as the operator-based
approach discussed in the FLIP.


Thank you~

Xintong Song



On Tue, Jan 19, 2021 at 4:29 AM Stephan Ewen  wrote:

> Thanks a lot, Yangze and Xintong for this FLIP.
>
> I want to say, first of all, that this is super well written. And the
> points that the FLIP makes about how to expose the configuration to users
> is exactly the right thing to figure out first.
> So good job here!
>
> About how to let users specify the resource profiles. If I can sum the FLIP
> and previous discussion up in my own words, the problem is the following:
>
> Operator-level specification is the simplest and cleanest approach, because
> > it avoids mixing operator configuration (resource) and scheduling. No
> > matter what other parameters change (chaining, slot sharing, switching
> > pipelined and blocking shuffles), the resource profiles stay the same.
> > But it would require that a user specifies resources on all operators,
> > which makes it hard to use. That's why the FLIP suggests going with
> > specifying resources on a Sharing-Group.
>
>
> I think both thoughts are important, so can we find a solution where the
> Resource Profiles are specified on an Operator, but we still avoid that we
> need to specify a resource profile on every operator?
>
> What do you think about something like the following:
>   - Resource Profiles are specified on an operator level.
>   - Not all operators need profiles
>   - All Operators without a Resource Profile ended up in the default slot
> sharing group with a default profile (will get a default slot).
>   - All Operators with a Resource Profile will go into another slot sharing
> group (the resource-specified-group).
>   - Users can define different slot sharing groups for operators like they
> do now, with the exception that you cannot mix operators