Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Gary Yao
Congratulations Andrey, well deserved!

Best,
Gary

On Thu, Aug 15, 2019 at 7:50 AM Bowen Li  wrote:

> Congratulations Andrey!
>
> On Wed, Aug 14, 2019 at 10:18 PM Rong Rong  wrote:
>
>> Congratulations Andrey!
>>
>> On Wed, Aug 14, 2019 at 10:14 PM chaojianok  wrote:
>>
>> > Congratulations Andrey!
>> > At 2019-08-14 21:26:37, "Till Rohrmann"  wrote:
>> > >Hi everyone,
>> > >
>> > >I'm very happy to announce that Andrey Zagrebin accepted the offer of
>> the
>> > >Flink PMC to become a committer of the Flink project.
>> > >
>> > >Andrey has been an active community member for more than 15 months. He
>> has
>> > >helped shape numerous features such as State TTL, FRocksDB release,
>> > >Shuffle service abstraction, FLIP-1, result partition management and
>> > >various fixes/improvements. He's also frequently helping out on the
>> > >user@f.a.o mailing lists.
>> > >
>> > >Congratulations Andrey!
>> > >
>> > >Best, Till
>> > >(on behalf of the Flink PMC)
>> >
>>
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Bowen Li
Congratulations Andrey!

On Wed, Aug 14, 2019 at 10:18 PM Rong Rong  wrote:

> Congratulations Andrey!
>
> On Wed, Aug 14, 2019 at 10:14 PM chaojianok  wrote:
>
> > Congratulations Andrey!
> > At 2019-08-14 21:26:37, "Till Rohrmann"  wrote:
> > >Hi everyone,
> > >
> > >I'm very happy to announce that Andrey Zagrebin accepted the offer of
> the
> > >Flink PMC to become a committer of the Flink project.
> > >
> > >Andrey has been an active community member for more than 15 months. He
> has
> > >helped shape numerous features such as State TTL, FRocksDB release,
> > >Shuffle service abstraction, FLIP-1, result partition management and
> > >various fixes/improvements. He's also frequently helping out on the
> > >user@f.a.o mailing lists.
> > >
> > >Congratulations Andrey!
> > >
> > >Best, Till
> > >(on behalf of the Flink PMC)
> >
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Rong Rong
Congratulations Andrey!

On Wed, Aug 14, 2019 at 10:14 PM chaojianok  wrote:

> Congratulations Andrey!
> At 2019-08-14 21:26:37, "Till Rohrmann"  wrote:
> >Hi everyone,
> >
> >I'm very happy to announce that Andrey Zagrebin accepted the offer of the
> >Flink PMC to become a committer of the Flink project.
> >
> >Andrey has been an active community member for more than 15 months. He has
> >helped shape numerous features such as State TTL, FRocksDB release,
> >Shuffle service abstraction, FLIP-1, result partition management and
> >various fixes/improvements. He's also frequently helping out on the
> >user@f.a.o mailing lists.
> >
> >Congratulations Andrey!
> >
> >Best, Till
> >(on behalf of the Flink PMC)
>


Re:[ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread chaojianok
Congratulations Andrey!

At 2019-08-14 21:26:37, "Till Rohrmann"  wrote:

Hi everyone,


I'm very happy to announce that Andrey Zagrebin accepted the offer of the Flink 
PMC to become a committer of the Flink project.


Andrey has been an active community member for more than 15 months. He has 
helped shape numerous features such as State TTL, FRocksDB release, Shuffle 
service abstraction, FLIP-1, result partition management and various 
fixes/improvements. He's also frequently helping out on the user@f.a.o mailing 
lists.


Congratulations Andrey!


Best, Till 

(on behalf of the Flink PMC)

Re:[ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread chaojianok
Congratulations Andrey!
At 2019-08-14 21:26:37, "Till Rohrmann"  wrote:
>Hi everyone,
>
>I'm very happy to announce that Andrey Zagrebin accepted the offer of the
>Flink PMC to become a committer of the Flink project.
>
>Andrey has been an active community member for more than 15 months. He has
>helped shape numerous features such as State TTL, FRocksDB release,
>Shuffle service abstraction, FLIP-1, result partition management and
>various fixes/improvements. He's also frequently helping out on the
>user@f.a.o mailing lists.
>
>Congratulations Andrey!
>
>Best, Till
>(on behalf of the Flink PMC)


Re:Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread chaojianok
Congratulations Andrey!
At 2019-08-15 10:02:49, "Jark Wu"  wrote:
>Congratulations Andrey!
>
>
>Cheers,
>Jark
>
>On Thu, 15 Aug 2019 at 00:57, jincheng sun  wrote:
>
>> Congrats Andrey! Very happy to have you onboard :)
>>
>> Best, Jincheng
>>
>> Yu Li  于2019年8月15日周四 上午12:06写道:
>>
>> > Congratulations Andrey! Well deserved!
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak  wrote:
>> >
>> > > Congratulations, Andrey!
>> > >
>> > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas 
>> > > wrote:
>> > >
>> > > > Congrats Andrey!
>> > > >
>> > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin 
>> wrote:
>> > > >
>> > > > > Congratulations, Andrey!
>> > > > >
>> > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise 
>> wrote:
>> > > > >
>> > > > > > Congrats!
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
>> rmetz...@apache.org>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Congratulations! Very happy to have you onboard :)
>> > > > > > >
>> > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
>> > kklou...@gmail.com
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Congratulations Andrey!
>> > > > > > > > Well deserved!
>> > > > > > > >
>> > > > > > > > Kostas
>> > > > > > > >
>> > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang 
>> > > wrote:
>> > > > > > > > >
>> > > > > > > > > Congratulations Andrey.
>> > > > > > > > >
>> > > > > > > > > Best
>> > > > > > > > > Yun Tang
>> > > > > > > > > 
>> > > > > > > > > From: Xintong Song 
>> > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
>> > > > > > > > > To: Oytun Tez 
>> > > > > > > > > Cc: Zili Chen ; Till Rohrmann <
>> > > > > > > > trohrm...@apache.org>; dev ; user <
>> > > > > > > > u...@flink.apache.org>
>> > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
>> > > committer
>> > > > > > > > >
>> > > > > > > > > Congratulations Andrey~!
>> > > > > > > > >
>> > > > > > > > > Thank you~
>> > > > > > > > >
>> > > > > > > > > Xintong Song
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
>> > oy...@motaword.com>
>> > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > Congratulations Andrey!
>> > > > > > > > >
>> > > > > > > > > I am glad the Flink committer team is growing at such a
>> pace!
>> > > > > > > > >
>> > > > > > > > > ---
>> > > > > > > > > Oytun Tez
>> > > > > > > > >
>> > > > > > > > > M O T A W O R D
>> > > > > > > > > The World's Fastest Human Translation Platform.
>> > > > > > > > > oy...@motaword.com — www.motaword.com
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
>> > > wander4...@gmail.com>
>> > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > Congratulations Andrey!
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > tison.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Till Rohrmann  于2019年8月14日周三
>> 下午9:26写道:
>> > > > > > > > >
>> > > > > > > > > Hi everyone,
>> > > > > > > > >
>> > > > > > > > > I'm very happy to announce that Andrey Zagrebin accepted
>> the
>> > > > offer
>> > > > > of
>> > > > > > > > the Flink PMC to become a committer of the Flink project.
>> > > > > > > > >
>> > > > > > > > > Andrey has been an active community member for more than 15
>> > > > months.
>> > > > > > He
>> > > > > > > > has helped shape numerous features such as State TTL,
>> > FRocksDB
>> > > > > > release,
>> > > > > > > > Shuffle service abstraction, FLIP-1, result partition
>> > management
>> > > > and
>> > > > > > > > various fixes/improvements. He's also frequently helping out
>> on
>> > > the
>> > > > > > > > user@f.a.o mailing lists.
>> > > > > > > > >
>> > > > > > > > > Congratulations Andrey!
>> > > > > > > > >
>> > > > > > > > > Best, Till
>> > > > > > > > > (on behalf of the Flink PMC)
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Markos Sfikas
>> > > > +49 (0) 15759630002
>> > > >
>> > >
>> >
>>


[jira] [Created] (FLINK-13732) Enhance JobManagerMetricGroup with FLIP-6 architecture

2019-08-14 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13732:


 Summary: Enhance JobManagerMetricGroup with FLIP-6 architecture
 Key: FLINK-13732
 URL: https://issues.apache.org/jira/browse/FLINK-13732
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Biao Liu
 Fix For: 1.10.0


This is a requirement from the user mailing list [1]. I think it's reasonable 
enough to support.

The scenario: when deploying a Flink cluster on YARN, there may be several 
{{JM(RM)}}s running on the same host, which is quite a common setup. However, we 
currently cannot distinguish the metrics coming from the different 
{{JobManagerMetricGroup}}s, because "hostname" is the only scope variable we can 
use.
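
To illustrate the ambiguity, here is a toy sketch (not Flink code; the scope format, 
metric name, host names, and id values are made up):
{code:java}
// With "hostname" as the only JobManager scope variable, two JMs co-located on one
// YARN host report indistinguishable metric identifiers. An id variable (analogous
// to tm_id) would disambiguate them.
public class ScopeCollisionSketch {

    static String currentScope(String hostname) {
        return hostname + ".jobmanager.numRegisteredTaskManagers";
    }

    static String scopeWithId(String hostname, String jmId) {
        return hostname + ".jobmanager." + jmId + ".numRegisteredTaskManagers";
    }

    public static void main(String[] args) {
        // Two JobManagers running on the same YARN host:
        System.out.println(currentScope("yarn-host-1"));           // identical to ...
        System.out.println(currentScope("yarn-host-1"));           // ... this one
        System.out.println(scopeWithId("yarn-host-1", "jm-a1b2")); // distinguishable
        System.out.println(scopeWithId("yarn-host-1", "jm-c3d4")); // once an id exists
    }
}
{code}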

I think there are some problems with the current implementation of 
{{JobManagerMetricGroup}}: it is still in the pre-FLIP-6 style. We should split the 
metric group into {{RM}} and {{Dispatcher}} to match the FLIP-6 architecture, 
and there should be an identification variable, analogous to {{tm_id}}.

CC [~StephanEwen], [~till.rohrmann], [~Zentol]

1. 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-metrics-scope-for-YARN-single-job-td29389.html]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread leesf
Congratulations Andrey!

Best,
Leesf

Zhu Zhu  于2019年8月15日周四 上午11:20写道:

> Congratulations Andrey!
>
> Thanks,
> Zhu Zhu
>
> vino yang  于2019年8月15日周四 上午11:05写道:
>
> > Congratulations Andrey!
> >
> > Best,
> > Vino
> >
> > Yun Gao  于2019年8月15日周四 上午10:49写道:
> >
> > > Congratulations Andrey!
> > >
> > > Best,
> > > Yun
> > >
> > >
> > > --
> > > From:Congxian Qiu 
> > > Send Time:2019 Aug. 15 (Thu.) 10:28
> > > To:dev@flink.apache.org 
> > > Subject:Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> > >
> > > Congratulations Andrey!
> > > Best,
> > > Congxian
> > >
> > >
> > > Kurt Young  于2019年8月15日周四 上午10:12写道:
> > >
> > > > Congratulations Andrey!
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Thu, Aug 15, 2019 at 10:09 AM Biao Liu 
> wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > > Thanks,
> > > > > Biao /'bɪ.aʊ/
> > > > >
> > > > >
> > > > >
> > > > > On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
> > > > >
> > > > > > Congratulations Andrey!
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > > Jark
> > > > > >
> > > > > > On Thu, 15 Aug 2019 at 00:57, jincheng sun <
> > sunjincheng...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Congrats Andrey! Very happy to have you onboard :)
> > > > > > >
> > > > > > > Best, Jincheng
> > > > > > >
> > > > > > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > > > > > >
> > > > > > > > Congratulations Andrey! Well deserved!
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Yu
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak <
> > alek...@ververica.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations, Andrey!
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> > > > > mark.sfi...@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congrats Andrey!
> > > > > > > > > >
> > > > > > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin <
> > > becket@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congratulations, Andrey!
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise <
> > > t...@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Congrats!
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > > > > > rmetz...@apache.org>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > > > > > kklou...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > > > Well deserved!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Kostas
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> > > > > myas...@live.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best
> > > > > > > > > > > > > > > Yun Tang
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > > > > > Cc: Zili Chen ; Till
> > > Rohrmann
> > > > <
> > > > > > > > > > > > > > trohrm...@apache.org>; dev  >;
> > > > user
> > > > > <
> > > > > > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin
> becomes a
> > > > Flink
> > > > > > > > > committer
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Congratulations Andery~!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > > > > > oy...@motaword.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am glad the Flink committer team is growing
> at
> > > > such a
> > > > > > > pace!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > Oytun Tez
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > > > > > >
> > > > > 

[jira] [Created] (FLINK-13731) flink sql support window with alignment

2019-08-14 Thread zhaoshijie (JIRA)
zhaoshijie created FLINK-13731:
--

 Summary: flink sql support window with alignment
 Key: FLINK-13731
 URL: https://issues.apache.org/jira/browse/FLINK-13731
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: zhaoshijie


Currently, the following SQL is not supported by Flink SQL:
{code:java}
SELECT COUNT(*) GROUP BY TUMBLE(pt, interval '1' DAY, time '00:08:00')
{code}
When the time attribute is processing time, the window is aligned to UTC, so the 
daily window boundaries are not correct for users in a different time zone. The 
window should therefore support an alignment offset.
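
For reference, the DataStream API already provides this kind of alignment via an 
optional window offset. A minimal sketch (element values and job name are 
placeholders); for example, a daily processing-time window aligned to UTC+8 uses an 
offset of -8 hours:
{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class AlignedDailyWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1L, 2L, 3L)  // placeholder source
            // 1-day window whose boundaries fall on local midnight in UTC+8.
            .windowAll(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-8)))
            .reduce((a, b) -> a + b)  // stands in for the COUNT(*) aggregation
            .print();

        env.execute("aligned-daily-window-sketch");
    }
}
{code}
A comparable alignment argument at the SQL level would give TUMBLE the same capability.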

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread SHI Xiaogang
+1

Glad that programming with flink becomes simpler and easier.

Regards,
Xiaogang

Aljoscha Krettek  于2019年8月14日周三 下午11:31写道:

> +1 (for the same reasons I posted on the other thread)
>
> > On 14. Aug 2019, at 15:03, Zili Chen  wrote:
> >
> > +1
> >
> > It could be regarded as part of the Flink client API refactoring.
> > Removing stale code paths makes the refactoring easier to reason about.
> >
> > One thing worth attention: in this thread [1], Thomas suggests an
> > interface with a method that returns a JobGraph, based on the fact that
> > the REST API and per-job mode actually extract the JobGraph from the
> > user program and submit it, instead of running the user program (in the
> > session scenario, submission happens inside the program).
> >
> > Such an interface would look like:
> >
> > interface Program {
> >   JobGraph getJobGraph(String... args);
> > }
> >
> > Anyway, that discussion can be continued in the other thread. The
> > current Program is a legacy class that is far less useful than it
> > should be.
> >
> > Best,
> > tison.
> >
> > [1]
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/REST-API-JarRunHandler-More-flexibility-for-launching-jobs-td31026.html#a31168
> >
> >
> > Stephan Ewen  于2019年8月14日周三 下午7:50写道:
> >
> >> +1
> >>
> >> the "main" method is the overwhelming default. getting rid of "two ways
> to
> >> do things" is a good idea.
> >>
> >> On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas 
> wrote:
> >>
> >>> Hi all,
> >>>
> >>> As discussed in [1], the Program interface seems to be outdated and
> >>> there seems to be
> >>> no objection to removing it.
> >>>
> >>> Given that this interface is PublicEvolving, its removal should pass
> >>> through a FLIP and
> >>> this discussion and the associated FLIP are exactly for that purpose.
> >>>
> >>> Please let me know what you think and if it is ok to proceed with its
> >>> removal.
> >>>
> >>> Cheers,
> >>> Kostas
> >>>
> >>> link to FLIP-52 :
> >>>
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
> >>>
> >>> [1]
> >>>
> >>
> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
> >>>
> >>
>
>
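
For context, the "main method" convention that remains as the single entry point is 
simply a class whose main method builds and executes the job. A minimal sketch 
(element values and job name are placeholders):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MainMethodEntryPoint {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c")   // placeholder source
            .map(String::toUpperCase)
            .print();

        // In session mode submission happens here; in per-job mode the client can
        // instead extract the JobGraph from this program, as discussed above.
        env.execute("main-method-entry-point");
    }
}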


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Zhu Zhu
Congratulations Andrey!

Thanks,
Zhu Zhu

vino yang  于2019年8月15日周四 上午11:05写道:

> Congratulations Andrey!
>
> Best,
> Vino
>
> Yun Gao  于2019年8月15日周四 上午10:49写道:
>
> > Congratulations Andrey!
> >
> > Best,
> > Yun
> >
> >
> > --
> > From:Congxian Qiu 
> > Send Time:2019 Aug. 15 (Thu.) 10:28
> > To:dev@flink.apache.org 
> > Subject:Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> >
> > Congratulations Andrey!
> > Best,
> > Congxian
> >
> >
> > Kurt Young  于2019年8月15日周四 上午10:12写道:
> >
> > > Congratulations Andrey!
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Thu, Aug 15, 2019 at 10:09 AM Biao Liu  wrote:
> > >
> > > > Congrats!
> > > >
> > > > Thanks,
> > > > Biao /'bɪ.aʊ/
> > > >
> > > >
> > > >
> > > > On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
> > > >
> > > > > Congratulations Andrey!
> > > > >
> > > > >
> > > > > Cheers,
> > > > > Jark
> > > > >
> > > > > On Thu, 15 Aug 2019 at 00:57, jincheng sun <
> sunjincheng...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Congrats Andrey! Very happy to have you onboard :)
> > > > > >
> > > > > > Best, Jincheng
> > > > > >
> > > > > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > > > > >
> > > > > > > Congratulations Andrey! Well deserved!
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Yu
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak <
> alek...@ververica.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations, Andrey!
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> > > > mark.sfi...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congrats Andrey!
> > > > > > > > >
> > > > > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin <
> > becket@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congratulations, Andrey!
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise <
> > t...@apache.org
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congrats!
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > > > > rmetz...@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > > > > kklou...@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > > Well deserved!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Kostas
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> > > > myas...@live.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best
> > > > > > > > > > > > > > Yun Tang
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > > > > Cc: Zili Chen ; Till
> > Rohrmann
> > > <
> > > > > > > > > > > > > trohrm...@apache.org>; dev ;
> > > user
> > > > <
> > > > > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a
> > > Flink
> > > > > > > > committer
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > > > > oy...@motaword.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am glad the Flink committer team is growing at
> > > such a
> > > > > > pace!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > Oytun Tez
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > > > > > wander4...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > tison.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > 

Re: [DISCUSS] FLIP-51: Rework of the Expression Design

2019-08-14 Thread Jark Wu
Thanks Jingsong for starting the discussion.

The general design of the FLIP looks good to me. +1 for the FLIP. It's time
to get rid of the old Expression!

Regarding the function behavior, shall we also include the new functions
from the Blink planner (e.g. LISTAGG, REGEXP, TO_DATE, etc.)?


Best,
Jark





On Wed, 14 Aug 2019 at 23:34, Timo Walther  wrote:

> Hi Jingsong,
>
> thanks for writing down this FLIP. Big +1 from my side to finally get
> rid of PlannerExpressions and have consistent and well-defined behavior
> for Table API and SQL updated to FLIP-37.
>
> We might need to discuss some of the behavior of particular functions
> but this should not affect the actual FLIP-51.
>
> Regards,
> Timo
>
>
> Am 13.08.19 um 12:55 schrieb JingsongLee:
> > Hi everyone,
> >
> > We would like to start a discussion thread on "FLIP-51: Rework of the
> > Expression Design" (design doc: [1], FLIP: [2]), where we describe how
> > to improve the new Java Expressions to work with type inference and
> > how to convert expressions to Calcite RexNodes. This is a follow-up plan
> > for FLIP-32 [3] and FLIP-37 [4]. This FLIP is mostly based on FLIP-37.
> >
> > This FLIP addresses several shortcomings of the current design:
> > - New Expressions still rely on PlannerExpressions for type inference and
> >   for conversion to RexNode; flink-planner and blink-planner share a lot
> >   of repetitive code and logic.
> > - Make the Table API and Calcite definitions consistent.
> > - Reduce the complexity of function development.
> > - Provide more powerful functions for users.
> >
> > Key changes can be summarized as follows:
> > - Improve the interface of FunctionDefinition.
> > - Introduce type inference for built-in functions.
> > - Introduce ExpressionConverter to convert Expression to calcite
> >   RexNode.
> > - Remove repetitive code and logic in planners.
> >
> > I also listed the type inference and behavior of all built-in functions [5]
> > to verify that the interface is sufficient. After introducing type inference
> > into the table-common module, the planners should have unified function
> > behavior. This also gives the community the chance to quickly discuss the
> > types and behavior of functions one last time before they are declared
> > stable.
> >
> > Looking forward to your feedbacks. Thank you.
> >
> > [1]
> https://docs.google.com/document/d/1yFDyquMo_-VZ59vyhaMshpPtg7p87b9IYdAtMXv5XmM/edit?usp=sharing
> > [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-51%3A+Rework+of+the+Expression+Design
> > [3]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions
> > [4]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System
> > [5]
> https://docs.google.com/document/d/1fyVmdGgbO1XmIyQ1BaoG_h5BcNcF3q9UJ1Bj1euO240/edit?usp=sharing
> >
> > Best,
> > Jingsong Lee
>
>
>
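
Purely illustrative (this is not the FLIP-51 API; all names below are made up): the 
core idea is that type inference lives with the function definition itself, so both 
planners can share it instead of re-implementing it in PlannerExpressions.

import java.util.List;
import java.util.Optional;

// Toy shape of "type inference belongs to the function definition".
interface ToyTypeInference {
    // Derive the result type from the argument types; empty means the call is invalid.
    Optional<String> inferResultType(List<String> argumentTypes);
}

// Example: a CONCAT-like function that accepts only STRING arguments and returns STRING.
final class ToyConcatDefinition implements ToyTypeInference {
    @Override
    public Optional<String> inferResultType(List<String> argumentTypes) {
        boolean allStrings = !argumentTypes.isEmpty()
                && argumentTypes.stream().allMatch("STRING"::equals);
        return allStrings ? Optional.of("STRING") : Optional.empty();
    }
}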


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread vino yang
Congratulations Andrey!

Best,
Vino

Yun Gao  于2019年8月15日周四 上午10:49写道:

> Congratulations Andrey!
>
> Best,
> Yun
>
>
> --
> From:Congxian Qiu 
> Send Time:2019 Aug. 15 (Thu.) 10:28
> To:dev@flink.apache.org 
> Subject:Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
>
> Congratulations Andrey!
> Best,
> Congxian
>
>
> Kurt Young  于2019年8月15日周四 上午10:12写道:
>
> > Congratulations Andrey!
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Aug 15, 2019 at 10:09 AM Biao Liu  wrote:
> >
> > > Congrats!
> > >
> > > Thanks,
> > > Biao /'bɪ.aʊ/
> > >
> > >
> > >
> > > On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
> > >
> > > > Congratulations Andrey!
> > > >
> > > >
> > > > Cheers,
> > > > Jark
> > > >
> > > > On Thu, 15 Aug 2019 at 00:57, jincheng sun  >
> > > > wrote:
> > > >
> > > > > Congrats Andrey! Very happy to have you onboard :)
> > > > >
> > > > > Best, Jincheng
> > > > >
> > > > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > > > >
> > > > > > Congratulations Andrey! Well deserved!
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak  >
> > > > wrote:
> > > > > >
> > > > > > > Congratulations, Andrey!
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> > > mark.sfi...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Congrats Andrey!
> > > > > > > >
> > > > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin <
> becket@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations, Andrey!
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise <
> t...@apache.org
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congrats!
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > > > rmetz...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > > > kklou...@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > Well deserved!
> > > > > > > > > > > >
> > > > > > > > > > > > Kostas
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> > > myas...@live.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best
> > > > > > > > > > > > > Yun Tang
> > > > > > > > > > > > > 
> > > > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > > > Cc: Zili Chen ; Till
> Rohrmann
> > <
> > > > > > > > > > > > trohrm...@apache.org>; dev ;
> > user
> > > <
> > > > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a
> > Flink
> > > > > > > committer
> > > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > > > oy...@motaword.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am glad the Flink committer team is growing at
> > such a
> > > > > pace!
> > > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > > Oytun Tez
> > > > > > > > > > > > >
> > > > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > > > > wander4...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > tison.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > > > > 下午9:26写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm very happy to announce that Andrey Zagrebin
> > > accepted
> > > > > the
> > > > > > > > offer
> > > > > > > > > of
> > > > > > > > > > > > the Flink PMC to become a committer of the Flink
> > project.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Andrey has been an active community 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Yun Gao
Congratulations Andrey!

Best,
Yun


--
From:Congxian Qiu 
Send Time:2019 Aug. 15 (Thu.) 10:28
To:dev@flink.apache.org 
Subject:Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

Congratulations Andrey!
Best,
Congxian


Kurt Young  于2019年8月15日周四 上午10:12写道:

> Congratulations Andrey!
>
> Best,
> Kurt
>
>
> On Thu, Aug 15, 2019 at 10:09 AM Biao Liu  wrote:
>
> > Congrats!
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
> >
> >
> > On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
> >
> > > Congratulations Andrey!
> > >
> > >
> > > Cheers,
> > > Jark
> > >
> > > On Thu, 15 Aug 2019 at 00:57, jincheng sun 
> > > wrote:
> > >
> > > > Congrats Andrey! Very happy to have you onboard :)
> > > >
> > > > Best, Jincheng
> > > >
> > > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > > >
> > > > > Congratulations Andrey! Well deserved!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak 
> > > wrote:
> > > > >
> > > > > > Congratulations, Andrey!
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> > mark.sfi...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Congrats Andrey!
> > > > > > >
> > > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin  >
> > > > wrote:
> > > > > > >
> > > > > > > > Congratulations, Andrey!
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Congrats!
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > > rmetz...@apache.org>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > > kklou...@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > Well deserved!
> > > > > > > > > > >
> > > > > > > > > > > Kostas
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> > myas...@live.com
> > > >
> > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > > >
> > > > > > > > > > > > Best
> > > > > > > > > > > > Yun Tang
> > > > > > > > > > > > 
> > > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > > Cc: Zili Chen ; Till Rohrmann
> <
> > > > > > > > > > > trohrm...@apache.org>; dev ;
> user
> > <
> > > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a
> Flink
> > > > > > committer
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > > oy...@motaword.com>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > >
> > > > > > > > > > > > I am glad the Flink committer team is growing at
> such a
> > > > pace!
> > > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > > Oytun Tez
> > > > > > > > > > > >
> > > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > > > wander4...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > tison.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > > > 下午9:26写道:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I'm very happy to announce that Andrey Zagrebin
> > accepted
> > > > the
> > > > > > > offer
> > > > > > > > of
> > > > > > > > > > > the Flink PMC to become a committer of the Flink
> project.
> > > > > > > > > > > >
> > > > > > > > > > > > Andrey has been an active community member for more
> > than
> > > 15
> > > > > > > months.
> > > > > > > > > He
> > > > > > > > > > has helped shape numerous features such as State TTL,
> > > > > FRocksDB
> > > > > > > > > release,
> > > > > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > > > > management
> > > > > > > and
> > > > > > > > > > > various fixes/improvements. He's also frequently
> helping
> > > out
> > > > on
> > > > > > the
> > > > > > > > > > > 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Congxian Qiu
Congratulations Andrey!
Best,
Congxian


Kurt Young  于2019年8月15日周四 上午10:12写道:

> Congratulations Andrey!
>
> Best,
> Kurt
>
>
> On Thu, Aug 15, 2019 at 10:09 AM Biao Liu  wrote:
>
> > Congrats!
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
> >
> >
> > On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
> >
> > > Congratulations Andrey!
> > >
> > >
> > > Cheers,
> > > Jark
> > >
> > > On Thu, 15 Aug 2019 at 00:57, jincheng sun 
> > > wrote:
> > >
> > > > Congrats Andrey! Very happy to have you onboard :)
> > > >
> > > > Best, Jincheng
> > > >
> > > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > > >
> > > > > Congratulations Andrey! Well deserved!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak 
> > > wrote:
> > > > >
> > > > > > Congratulations, Andrey!
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> > mark.sfi...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Congrats Andrey!
> > > > > > >
> > > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin  >
> > > > wrote:
> > > > > > >
> > > > > > > > Congratulations, Andrey!
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Congrats!
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > > rmetz...@apache.org>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > > kklou...@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > Well deserved!
> > > > > > > > > > >
> > > > > > > > > > > Kostas
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> > myas...@live.com
> > > >
> > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > > >
> > > > > > > > > > > > Best
> > > > > > > > > > > > Yun Tang
> > > > > > > > > > > > 
> > > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > > Cc: Zili Chen ; Till Rohrmann
> <
> > > > > > > > > > > trohrm...@apache.org>; dev ;
> user
> > <
> > > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a
> Flink
> > > > > > committer
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > > oy...@motaword.com>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > >
> > > > > > > > > > > > I am glad the Flink committer team is growing at
> such a
> > > > pace!
> > > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > > Oytun Tez
> > > > > > > > > > > >
> > > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > > > wander4...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > tison.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > > > 下午9:26写道:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I'm very happy to announce that Andrey Zagrebin
> > accepted
> > > > the
> > > > > > > offer
> > > > > > > > of
> > > > > > > > > > > the Flink PMC to become a committer of the Flink
> project.
> > > > > > > > > > > >
> > > > > > > > > > > > Andrey has been an active community member for more
> > than
> > > 15
> > > > > > > months.
> > > > > > > > > He
> > > > > > > > > > > has helped shape numerous features such as State TTL,
> > > > > FRocksDB
> > > > > > > > > release,
> > > > > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > > > > management
> > > > > > > and
> > > > > > > > > > > various fixes/improvements. He's also frequently
> helping
> > > out
> > > > on
> > > > > > the
> > > > > > > > > > > user@f.a.o mailing lists.
> > > > > > > > > > > >
> > > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > > >
> > > > > > > > > > > > Best, Till
> > > > > > > > > > > > (on behalf of the Flink PMC)
> > > > > > > > > > >
> > > > > > > > > >
> > > 

[jira] [Created] (FLINK-13730) Cache and share the downloaded external distribution (e.g. Kafka) in E2E tests

2019-08-14 Thread Jark Wu (JIRA)
Jark Wu created FLINK-13730:
---

 Summary: Cache and share the downloaded external distribution 
(e.g. Kafka) in E2E tests
 Key: FLINK-13730
 URL: https://issues.apache.org/jira/browse/FLINK-13730
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka, Tests
Reporter: Jark Wu


We can try to cache and share the downloaded Kafka distribution among multiple 
e2e test runs.
There is a PR by Chesnay Schepler (https://github.com/apache/flink/pull/7605) 
which conceptually does this already.

This will help to reduce e2e testing time a lot. It would also be very helpful 
for developers in China who want to verify e2e tests locally, because the 
downloads there are very unstable and often nearly impossible to complete.

We can do the same thing for other external distributions, for example, 
elasticsearch.
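
A minimal sketch of the caching idea (the cache directory, file name, and URL below 
are just examples, not the actual e2e scripts):
{code:java}
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CachedDownloadSketch {

    // Reuse a previously downloaded distribution across test runs instead of re-downloading.
    static Path getOrDownload(String url, Path cacheDir, String fileName) throws Exception {
        Files.createDirectories(cacheDir);
        Path cached = cacheDir.resolve(fileName);
        if (Files.exists(cached)) {
            return cached; // cache hit: skip the slow/unstable download entirely
        }
        Path tmp = Files.createTempFile(cacheDir, fileName, ".part");
        try (InputStream in = new URL(url).openStream()) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        // Publish atomically so concurrent test runs never see a half-written file.
        Files.move(tmp, cached, StandardCopyOption.ATOMIC_MOVE);
        return cached;
    }

    public static void main(String[] args) throws Exception {
        Path kafka = getOrDownload(
                "https://archive.apache.org/dist/kafka/2.2.0/kafka_2.11-2.2.0.tgz",
                Paths.get(System.getProperty("user.home"), ".cache", "flink-e2e"),
                "kafka_2.11-2.2.0.tgz");
        System.out.println("Kafka distribution cached at: " + kafka);
    }
}
{code}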



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Kurt Young
Congratulations Andrey!

Best,
Kurt


On Thu, Aug 15, 2019 at 10:09 AM Biao Liu  wrote:

> Congrats!
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:
>
> > Congratulations Andrey!
> >
> >
> > Cheers,
> > Jark
> >
> > On Thu, 15 Aug 2019 at 00:57, jincheng sun 
> > wrote:
> >
> > > Congrats Andrey! Very happy to have you onboard :)
> > >
> > > Best, Jincheng
> > >
> > > Yu Li  于2019年8月15日周四 上午12:06写道:
> > >
> > > > Congratulations Andrey! Well deserved!
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak 
> > wrote:
> > > >
> > > > > Congratulations, Andrey!
> > > > >
> > > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas <
> mark.sfi...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Congrats Andrey!
> > > > > >
> > > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin 
> > > wrote:
> > > > > >
> > > > > > > Congratulations, Andrey!
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise 
> > > wrote:
> > > > > > >
> > > > > > > > Congrats!
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > > rmetz...@apache.org>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > > kklou...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > Well deserved!
> > > > > > > > > >
> > > > > > > > > > Kostas
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang <
> myas...@live.com
> > >
> > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey.
> > > > > > > > > > >
> > > > > > > > > > > Best
> > > > > > > > > > > Yun Tang
> > > > > > > > > > > 
> > > > > > > > > > > From: Xintong Song 
> > > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > > > > > trohrm...@apache.org>; dev ; user
> <
> > > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> > > > > committer
> > > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > > oy...@motaword.com>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > >
> > > > > > > > > > > I am glad the Flink committer team is growing at such a
> > > pace!
> > > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > > Oytun Tez
> > > > > > > > > > >
> > > > > > > > > > > M O T A W O R D
> > > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > > wander4...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > tison.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > > 下午9:26写道:
> > > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I'm very happy to announce that Andrey Zagrebin
> accepted
> > > the
> > > > > > offer
> > > > > > > of
> > > > > > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > > > > > >
> > > > > > > > > > > Andrey has been an active community member for more
> than
> > 15
> > > > > > months.
> > > > > > > > He
> > > > > > > > > > has helped shape numerous features such as State TTL,
> > > > FRocksDB
> > > > > > > > release,
> > > > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > > > management
> > > > > > and
> > > > > > > > > > various fixes/improvements. He's also frequently helping
> > out
> > > on
> > > > > the
> > > > > > > > > > user@f.a.o mailing lists.
> > > > > > > > > > >
> > > > > > > > > > > Congratulations Andrey!
> > > > > > > > > > >
> > > > > > > > > > > Best, Till
> > > > > > > > > > > (on behalf of the Flink PMC)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Markos Sfikas
> > > > > > +49 (0) 15759630002
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:

> Congratulations Andrey!
>
>
> Cheers,
> Jark
>
> On Thu, 15 Aug 2019 at 00:57, jincheng sun 
> wrote:
>
> > Congrats Andrey! Very happy to have you onboard :)
> >
> > Best, Jincheng
> >
> > Yu Li  于2019年8月15日周四 上午12:06写道:
> >
> > > Congratulations Andrey! Well deserved!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak 
> wrote:
> > >
> > > > Congratulations, Andrey!
> > > >
> > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas  >
> > > > wrote:
> > > >
> > > > > Congrats Andrey!
> > > > >
> > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin 
> > wrote:
> > > > >
> > > > > > Congratulations, Andrey!
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise 
> > wrote:
> > > > > >
> > > > > > > Congrats!
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > rmetz...@apache.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > kklou...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations Andrey!
> > > > > > > > > Well deserved!
> > > > > > > > >
> > > > > > > > > Kostas
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  >
> > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey.
> > > > > > > > > >
> > > > > > > > > > Best
> > > > > > > > > > Yun Tang
> > > > > > > > > > 
> > > > > > > > > > From: Xintong Song 
> > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > > > > trohrm...@apache.org>; dev ; user <
> > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> > > > committer
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey~!
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > oy...@motaword.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > I am glad the Flink committer team is growing at such a
> > pace!
> > > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > Oytun Tez
> > > > > > > > > >
> > > > > > > > > > M O T A W O R D
> > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > wander4...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > tison.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > 下午9:26写道:
> > > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'm very happy to announce that Andrey Zagrebin accepted
> > the
> > > > > offer
> > > > > > of
> > > > > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > > > > >
> > > > > > > > > > Andrey has been an active community member for more than
> 15
> > > > > months.
> > > > > > > He
> > > > > > > > > has helped shape numerous features such as State TTL,
> > > FRocksDB
> > > > > > > release,
> > > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > > management
> > > > > and
> > > > > > > > > various fixes/improvements. He's also frequently helping
> out
> > on
> > > > the
> > > > > > > > > user@f.a.o mailing lists.
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > Best, Till
> > > > > > > > > > (on behalf of the Flink PMC)
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Markos Sfikas
> > > > > +49 (0) 15759630002
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Jark Wu
Congratulations Andrey!


Cheers,
Jark

On Thu, 15 Aug 2019 at 00:57, jincheng sun  wrote:

> Congrats Andrey! Very happy to have you onboard :)
>
> Best, Jincheng
>
> Yu Li  于2019年8月15日周四 上午12:06写道:
>
> > Congratulations Andrey! Well deserved!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak  wrote:
> >
> > > Congratulations, Andrey!
> > >
> > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas 
> > > wrote:
> > >
> > > > Congrats Andrey!
> > > >
> > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin 
> wrote:
> > > >
> > > > > Congratulations, Andrey!
> > > > >
> > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise 
> wrote:
> > > > >
> > > > > > Congrats!
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> rmetz...@apache.org>
> > > > > wrote:
> > > > > >
> > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > kklou...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations Andrey!
> > > > > > > > Well deserved!
> > > > > > > >
> > > > > > > > Kostas
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang 
> > > wrote:
> > > > > > > > >
> > > > > > > > > Congratulations Andrey.
> > > > > > > > >
> > > > > > > > > Best
> > > > > > > > > Yun Tang
> > > > > > > > > 
> > > > > > > > > From: Xintong Song 
> > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > To: Oytun Tez 
> > > > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > > > trohrm...@apache.org>; dev ; user <
> > > > > > > > u...@flink.apache.org>
> > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> > > committer
> > > > > > > > >
> > > > > > > > > Congratulations Andrey~!
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > oy...@motaword.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Congratulations Andrey!
> > > > > > > > >
> > > > > > > > > I am glad the Flink committer team is growing at such a
> pace!
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > > Oytun Tez
> > > > > > > > >
> > > > > > > > > M O T A W O R D
> > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > wander4...@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Congratulations Andrey!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > tison.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Till Rohrmann  于2019年8月14日周三
> 下午9:26写道:
> > > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > I'm very happy to announce that Andrey Zagrebin accepted
> the
> > > > offer
> > > > > of
> > > > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > > > >
> > > > > > > > > Andrey has been an active community member for more than 15
> > > > months.
> > > > > > He
> > > > > > > > has helped shape numerous features such as State TTL,
> > FRocksDB
> > > > > > release,
> > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > management
> > > > and
> > > > > > > > various fixes/improvements. He's also frequently helping out
> on
> > > the
> > > > > > > > user@f.a.o mailing lists.
> > > > > > > > >
> > > > > > > > > Congratulations Andrey!
> > > > > > > > >
> > > > > > > > > Best, Till
> > > > > > > > > (on behalf of the Flink PMC)
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Markos Sfikas
> > > > +49 (0) 15759630002
> > > >
> > >
> >
>


Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-14 Thread Kurt Young
Hi Robert,

I will do it today.

Best,
Kurt


On Wed, Aug 14, 2019 at 11:55 PM Robert Metzger  wrote:

> Has anybody verified the inclusion of all bundled dependencies into the
> NOTICE files?
>
> I'm asking because we had some issues with that in the last release(s).
>
> On Wed, Aug 14, 2019 at 5:31 PM Aljoscha Krettek 
> wrote:
>
> > +1
> >
> > I did some testing on a Google Cloud Dataproc cluster (it gives you a
> > managed YARN and Google Cloud Storage (GCS)):
> >   - tried both YARN session mode and YARN per-job mode, also using
> > bin/flink list/cancel/etc. against a YARN session cluster
> >   - ran examples that write to GCS, both with the native Hadoop
> FileSystem
> > and a custom “plugin” FileSystem
> >   - ran stateful streaming jobs that use GCS as a checkpoint backend
> >   - tried running SQL programs on YARN using the SQL Cli: this worked for
> > YARN session mode but not for YARN per-job mode. Looking at the code I
> > don’t think per-job mode would work from seeing how it is implemented.
> But
> > I think it’s an OK restriction to have for now
> >   - in all the testing I had fine-grained recovery (region failover)
> > enabled but I didn’t simulate any failures
> >
> > > On 14. Aug 2019, at 15:20, Kurt Young  wrote:
> > >
> > > Hi,
> > >
> > > Thanks for preparing this release candidate. I have verified the
> > following:
> > >
> > > - verified that the checksums and GPG files match the corresponding
> > >   release files
> > > - verified that the source archives do not contain any binaries
> > > - built the source release with Scala 2.11 successfully
> > > - ran `mvn verify` locally; met 2 issues, [FLINK-13687] and [FLINK-13688],
> > >   but both are not release blockers. Other than that, all tests passed.
> > > - ran all e2e tests that don't need to download external packages
> > >   (downloads are very unstable in China and almost impossible to complete);
> > >   all passed
> > > - started a local cluster and ran some examples; met a small website
> > >   display issue [FLINK-13591], which is also not a release blocker
> > >
> > > Although we have pushed some fixes around the Blink planner and Hive
> > > integration after RC2, considering that these are both preview features, I
> > > lean toward being OK with releasing without these fixes.
> > >
> > > +1 from my side. (binding)
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Wed, Aug 14, 2019 at 5:13 PM Jark Wu  wrote:
> > >
> > >> Hi Gordon,
> > >>
> > >> I have verified the following things:
> > >>
> > >> - build the source release with Scala 2.12 and Scala 2.11 successfully
> > >> - checked/verified signatures and hashes
> > >> - checked that all POM files point to the same version
> > >> - ran some flink table related end-to-end tests locally and succeeded
> > >> (except TPC-H e2e failed which is reported in FLINK-13704)
> > >> - started cluster for both Scala 2.11 and 2.12, ran examples, verified
> > web
> > >> ui and log output, nothing unexpected
> > >> - started cluster, ran a SQL query to temporal join with kafka source
> > and
> > >> mysql jdbc table, and write results to kafka again. Using DDL to
> create
> > the
> > >> source and sinks. looks good.
> > >> - reviewed the release PR
> > >>
> > >> As FLINK-13704 is not recognized as blocker issue, so +1 from my side
> > >> (non-binding).
> > >>
> > >> On Tue, 13 Aug 2019 at 17:07, Till Rohrmann 
> > wrote:
> > >>
> > >>> Hi Richard,
> > >>>
> > >>> although I can see that it would be handy for users who have PubSub
> set
> > >> up,
> > >>> I would rather not include examples which require an external
> > dependency
> > >>> into the Flink distribution. I think examples should be
> self-contained.
> > >> My
> > >>> concern is that we would bloat the distribution for many users at the
> > >>> benefit of a few. Instead, I think it would be better to make these
> > >>> examples available differently, maybe through Flink's ecosystem
> website
> > >> or
> > >>> maybe a new examples section in Flink's documentation.
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Tue, Aug 13, 2019 at 9:43 AM Jark Wu  wrote:
> > >>>
> >  Hi Till,
> > 
> >  After thinking about we can use VARCHAR as an alternative of
> >  timestamp/time/date.
> >  I'm fine with not recognize it as a blocker issue.
> >  We can fix it into 1.9.1.
> > 
> > 
> >  Thanks,
> >  Jark
> > 
> > 
> >  On Tue, 13 Aug 2019 at 15:10, Richard Deurwaarder 
> > >>> wrote:
> > 
> > > Hello all,
> > >
> > > I noticed the PubSub example jar is not included in the examples/
> dir
> > >>> of
> > > flink-dist. I've created
> >  https://issues.apache.org/jira/browse/FLINK-13700
> > > + https://github.com/apache/flink/pull/9424/files to fix this.
> > >
> > > I will leave it up to you to decide if we want to add this to
> 1.9.0.
> > >
> > > Regards,
> > >
> > > Richard
> > >
> > > On Tue, Aug 13, 2019 at 9:04 AM Till 

[jira] [Created] (FLINK-13729) Update website generation dependencies

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13729:
---

 Summary: Update website generation dependencies
 Key: FLINK-13729
 URL: https://issues.apache.org/jira/browse/FLINK-13729
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


The website generation dependencies are quite old. By upgrading some of them we 
get improvements like a much nicer code highlighting and prepare for the jekyll 
update of FLINK-13726 and FLINK-13727.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13728) Fix wrong closing tag order in sidenav

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13728:
---

 Summary: Fix wrong closing tag order in sidenav
 Key: FLINK-13728
 URL: https://issues.apache.org/jira/browse/FLINK-13728
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.8.1, 1.9.0, 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


The order of closing HTML tags in the sidenav is wrong: instead of 
{{}} it should be {{}}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13727) Build docs with jekyll 4.0.0 (final)

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13727:
---

 Summary: Build docs with jekyll 4.0.0 (final)
 Key: FLINK-13727
 URL: https://issues.apache.org/jira/browse/FLINK-13727
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber


When Jekyll 4.0.0 is out, we should upgrade to this final version and 
discontinue using the beta.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13726) Build docs with jekyll 4.0.0.pre.beta1

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13726:
---

 Summary: Build docs with jekyll 4.0.0.pre.beta1
 Key: FLINK-13726
 URL: https://issues.apache.org/jira/browse/FLINK-13726
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


Jekyll 4 is way faster in generating the docs than jekyll 3 - probably due to 
the newly introduced cache. Site generation time goes down by roughly a factor 
of 2.5!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13725) Use sassc for faster doc generation

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13725:
---

 Summary: Use sassc for faster doc generation
 Key: FLINK-13725
 URL: https://issues.apache.org/jira/browse/FLINK-13725
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


Jekyll requires {{sass}} but can optionally also use a C-based implementation 
provided by {{sassc}}. Although we do not use sass directly, there may be some 
indirect use inside jekyll. It doesn't seem to hurt to upgrade here.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13724) Remove unnecessary whitespace from the docs' sitenav

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13724:
---

 Summary: Remove unnecessary whitespace from the docs' sitenav
 Key: FLINK-13724
 URL: https://issues.apache.org/jira/browse/FLINK-13724
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


The site navigation generates quite some white space that will end up in every 
HTML page. Removing this reduces final page sizes and also improved site 
generation speed.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13723) Use liquid-c for faster doc generation

2019-08-14 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-13723:
---

 Summary: Use liquid-c for faster doc generation
 Key: FLINK-13723
 URL: https://issues.apache.org/jira/browse/FLINK-13723
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Nico Kruber
Assignee: Nico Kruber


Jekyll requires {{liquid}} and only optionally uses {{liquid-c}} if available. 
The latter uses natively-compiled code and reduces generation time by ~5% for 
me.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Checkpointing under backpressure

2019-08-14 Thread Thomas Weise
-->

On Wed, Aug 14, 2019 at 10:23 AM zhijiang
 wrote:

> Thanks for these great points and disccusions!
>
> 1. Considering the way of triggering checkpoint RPC calls to all the tasks
> from Chandy Lamport, it combines two different mechanisms together to make
> sure that the trigger could be fast in different scenarios.
> But in flink world it might be not very worth trying that way, just as
> Stephan's analysis for it. Another concern is that it might bring more
> heavy loads for JobMaster broadcasting this checkpoint RPC to all the tasks
> in large scale job, especially for the very short checkpoint interval.
> Furthermore it would also cause other important RPC to be executed delay to
> bring potentail timeout risks.
>
> 2. I agree with the idea of drawing on the way "take state snapshot on
> first barrier" from Chandy Lamport instead of barrier alignment combining
> with unaligned checkpoints in flink.
>
> >  The benefit would be less latency increase in the channels which
> already have received barriers.
> >  However, as mentioned before, not prioritizing the inputs from
> which barriers are still missing can also have an adverse effect.
>
> I think we will not have an adverse effect if not prioritizing the inputs
> w/o barriers in this case. After sync snapshot, the task could actually
> process any input channels. For the input channel receiving the first
> barrier, we already have the obvious boundary for persisting buffers. For
> other channels w/o barriers we could persist the following buffers for
> these channels until barrier arrives in network. Because based on the
> credit based flow control, the barrier does not need credit to transport,
> then as long as the sender overtakes the barrier accross the output queue,
> the network stack would transport this barrier immediately no matter with
> the inputs condition on receiver side. So there is no requirements to
> consume accumulated buffers in these channels for higher priority. If so it
> seems that we will not waste any CPU cycles as Piotr concerns before.
>

I'm not sure I follow this. For the checkpoint to complete, any buffer that
arrived prior to the barrier would need to be part of the checkpointed state.
So wouldn't it be important to finish persisting these buffers as fast as
possible by prioritizing the respective inputs? The task won't be able to
process records quickly from the inputs that have already seen the barrier
when it is backpressured (or causing the backpressure).


>
> 3. Suppose the unaligned checkpoints performing well in practice, is it
> possible to make it as the only mechanism for handling all the cases? I
> mean for the non-backpressure scenario, there are less buffers even empty
> in input/output queue, then the "overtaking barrier--> trigger snapshot on
> first barrier--> persist buffers" might still work well. So we do not need
> to maintain two suits of mechanisms finally.
>
> 4.  The initial motivation of this dicussion is for checkpoint timeout in
> backpressure scenario. If we adjust the default timeout to a very big
> value, that means the checkpoint would never timeout and we only need to
> wait it finish. Then are there still any other problems/concerns if
> checkpoint takes long time to finish? Althougn we already knew some issues
> before, it is better to gather more user feedbacks to confirm which aspects
> could be solved in this feature design. E.g. the sink commit delay might
> not be coverd by unaligned solution.
>

Checkpoints taking too long is the concern that sparked this discussion
(the timeout is just a symptom). The slowness issue also applies to the
savepoint use case. We would need to be able to take a savepoint quickly in
order to roll forward a fix that can alleviate the backpressure (like
changing the parallelism or making a different configuration change).


>
> Best,
> Zhijiang
> --
> From:Stephan Ewen 
> Send Time:2019年8月14日(星期三) 17:43
> To:dev 
> Subject:Re: Checkpointing under backpressure
>
> Quick note: The current implementation is
>
> Align -> Forward -> Sync Snapshot Part (-> Async Snapshot Part)
>
> On Wed, Aug 14, 2019 at 5:21 PM Piotr Nowojski 
> wrote:
>
> > > Thanks for the great ideas so far.
> >
> > +1
> >
> > Regarding other things raised, I mostly agree with Stephan.
> >
> > I like the idea of simultaneously starting the checkpoint everywhere via
> > RPC call (especially in cases where Tasks are busy doing some synchronous
> > operations for example for tens of milliseconds. In that case every
> network
> > exchange adds tens of milliseconds of delay in propagating the
> checkpoint).
> > However I agree that this might be a premature optimisation assuming the
> > current state of our code (we already have checkpoint barriers).
> >
> > However I like the idea of switching from:
> >
> > 1. A -> S -> F (Align -> snapshot -> forward markers)
> >
> > To
> >
> > 2. S -> F -> L (Snapshot -> forward markers -> log pending 

Re: Checkpointing under backpressure

2019-08-14 Thread zhijiang
Thanks for these great points and discussions!

1. Considering the way of triggering checkpoint RPC calls to all the tasks from 
Chandy-Lamport, it combines two different mechanisms to make sure that the 
trigger is fast in different scenarios.
But in the Flink world it might not be worth trying that way, as per Stephan's 
analysis. Another concern is that it might put a heavier load on the JobMaster 
when broadcasting this checkpoint RPC to all the tasks in a large-scale job, 
especially with a very short checkpoint interval. Furthermore, it could also 
delay other important RPCs and bring potential timeout risks.

2. I agree with the idea of drawing on the Chandy-Lamport approach of "take the 
state snapshot on the first barrier" instead of barrier alignment, combined with 
unaligned checkpoints in Flink.

>  The benefit would be less latency increase in the channels which already 
>  have received barriers.
>  However, as mentioned before, not prioritizing the inputs from which 
>  barriers are still missing can also have an adverse effect.

I think we will not have an adverse effect if we do not prioritize the inputs 
w/o barriers in this case. After the sync snapshot, the task can actually 
process any input channel. For the input channel receiving the first barrier, we 
already have an obvious boundary for persisting buffers. For the other channels 
w/o barriers we can persist the subsequent buffers for these channels until 
their barrier arrives over the network. Because of the credit-based flow 
control, the barrier does not need credit to be transported, so as long as the 
sender lets the barrier overtake the output queue, the network stack transports 
this barrier immediately regardless of the input condition on the receiver side. 
So there is no requirement to consume the accumulated buffers in these channels 
with higher priority. If so, it seems that we will not waste any CPU cycles, as 
Piotr was concerned before.
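
To make point 2 concrete, here is a minimal, self-contained sketch of the 
"snapshot on first barrier, then persist in-flight buffers" idea. It is only an 
illustration of the mechanism described above; none of the class or method 
names below are actual Flink APIs.

import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch only -- not Flink code. */
public class UnalignedCheckpointSketch {

    interface Buffer {}
    interface InputChannel {}

    interface BufferPersister {
        void persist(InputChannel channel, Buffer buffer); // log in-flight data
        void complete(long checkpointId);                  // all channels done
    }

    interface Snapshotter {
        void snapshotState(long checkpointId);             // sync state snapshot
        void forwardBarrierDownstream(long checkpointId);  // overtakes output queue
    }

    private final Set<InputChannel> pendingChannels;
    private final BufferPersister persister;
    private final Snapshotter snapshotter;
    private boolean snapshotTaken;

    public UnalignedCheckpointSketch(Collection<InputChannel> channels,
                                     BufferPersister persister,
                                     Snapshotter snapshotter) {
        this.pendingChannels = new HashSet<>(channels);
        this.persister = persister;
        this.snapshotter = snapshotter;
    }

    /** Called when a checkpoint barrier arrives on one of the input channels. */
    public void onBarrier(InputChannel channel, long checkpointId) {
        if (!snapshotTaken) {
            // First barrier: snapshot immediately and forward the barrier,
            // without blocking or aligning any input channel.
            snapshotter.snapshotState(checkpointId);
            snapshotter.forwardBarrierDownstream(checkpointId);
            snapshotTaken = true;
        }
        // The in-flight data of this channel is now fully captured.
        pendingChannels.remove(channel);
        if (pendingChannels.isEmpty()) {
            persister.complete(checkpointId);
        }
    }

    /** Called for every data buffer; normal processing is never blocked. */
    public void onBuffer(InputChannel channel, Buffer buffer) {
        if (snapshotTaken && pendingChannels.contains(channel)) {
            // Buffer belongs to the in-flight data of this checkpoint.
            persister.persist(channel, buffer);
        }
        // ... regular record processing continues here ...
    }
}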

3. Supposing unaligned checkpoints perform well in practice, is it possible to 
make them the only mechanism for handling all the cases? I mean, in the 
non-backpressure scenario there are fewer buffers (or even empty input/output 
queues), so the "overtaking barrier --> trigger snapshot on first barrier --> 
persist buffers" approach might still work well. Then we would not need to 
maintain two sets of mechanisms in the end.

4. The initial motivation of this discussion is the checkpoint timeout in the 
backpressure scenario. If we adjust the default timeout to a very big value, the 
checkpoint would never time out and we only need to wait for it to finish. Then 
are there still any other problems/concerns if a checkpoint takes a long time to 
finish? Although we already knew some issues before, it is better to gather more 
user feedback to confirm which aspects could be solved in this feature design. 
E.g. the sink commit delay might not be covered by the unaligned solution.

Best,
Zhijiang
--
From:Stephan Ewen 
Send Time:2019年8月14日(星期三) 17:43
To:dev 
Subject:Re: Checkpointing under backpressure

Quick note: The current implementation is

Align -> Forward -> Sync Snapshot Part (-> Async Snapshot Part)

On Wed, Aug 14, 2019 at 5:21 PM Piotr Nowojski  wrote:

> > Thanks for the great ideas so far.
>
> +1
>
> Regarding other things raised, I mostly agree with Stephan.
>
> I like the idea of simultaneously starting the checkpoint everywhere via
> RPC call (especially in cases where Tasks are busy doing some synchronous
> operations for example for tens of milliseconds. In that case every network
> exchange adds tens of milliseconds of delay in propagating the checkpoint).
> However I agree that this might be a premature optimisation assuming the
> current state of our code (we already have checkpoint barriers).
>
> However I like the idea of switching from:
>
> 1. A -> S -> F (Align -> snapshot -> forward markers)
>
> To
>
> 2. S -> F -> L (Snapshot -> forward markers -> log pending channels)
>
> Or even to
>
> 6. F -> S -> L (Forward markers -> snapshot -> log pending channels)
>
> It feels to me like this would decouple propagation of checkpoints from
> costs of synchronous snapshots and waiting for all of the checkpoint
> barriers to arrive (even if they will overtake in-flight records, this
> might take some time).
>
> > What I like about the Chandy Lamport approach (2.) initiated from
> sources is that:
> >   - Snapshotting imposes no modification to normal processing.
>
> Yes, I agree that would be nice. Currently, during the alignment and
> blocking of the input channels, we might be wasting CPU cycles of up stream
> tasks. If we succeed in designing new checkpointing mechanism to not
> disrupt/block regular data processing (% the extra IO cost for logging the
> in-flight records), that would be a huge improvement.
>
> Piotrek
>
> > On 14 Aug 2019, at 14:56, Paris Carbone  wrote:
> >
> > Sure I see. In 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread jincheng sun
Congrats Andrey! Very happy to have you onboard :)

Best, Jincheng

Yu Li  于2019年8月15日周四 上午12:06写道:

> Congratulations Andrey! Well deserved!
>
> Best Regards,
> Yu
>
>
> On Wed, 14 Aug 2019 at 17:55, Aleksey Pak  wrote:
>
> > Congratulations, Andrey!
> >
> > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas 
> > wrote:
> >
> > > Congrats Andrey!
> > >
> > > On Wed, 14 Aug 2019 at 16:47, Becket Qin  wrote:
> > >
> > > > Congratulations, Andrey!
> > > >
> > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > >
> > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger 
> > > > wrote:
> > > > >
> > > > > > Congratulations! Very happy to have you onboard :)
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> kklou...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Congratulations Andrey!
> > > > > > > Well deserved!
> > > > > > >
> > > > > > > Kostas
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang 
> > wrote:
> > > > > > > >
> > > > > > > > Congratulations Andrey.
> > > > > > > >
> > > > > > > > Best
> > > > > > > > Yun Tang
> > > > > > > > 
> > > > > > > > From: Xintong Song 
> > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > To: Oytun Tez 
> > > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > > trohrm...@apache.org>; dev ; user <
> > > > > > > u...@flink.apache.org>
> > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> > committer
> > > > > > > >
> > > > > > > > Congratulations Andery~!
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> oy...@motaword.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > Congratulations Andrey!
> > > > > > > >
> > > > > > > > I am glad the Flink committer team is growing at such a pace!
> > > > > > > >
> > > > > > > > ---
> > > > > > > > Oytun Tez
> > > > > > > >
> > > > > > > > M O T A W O R D
> > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > wander4...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > Congratulations Andrey!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > tison.
> > > > > > > >
> > > > > > > >
> > > > > > > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'm very happy to announce that Andrey Zagrebin accepted the
> > > offer
> > > > of
> > > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > > >
> > > > > > > > Andrey has been an active community member for more than 15
> > > months.
> > > > > He
> > > > > > > has helped shaping numerous features such as State TTL,
> FRocksDB
> > > > > release,
> > > > > > > Shuffle service abstraction, FLIP-1, result partition
> management
> > > and
> > > > > > > various fixes/improvements. He's also frequently helping out on
> > the
> > > > > > > user@f.a.o mailing lists.
> > > > > > > >
> > > > > > > > Congratulations Andrey!
> > > > > > > >
> > > > > > > > Best, Till
> > > > > > > > (on behalf of the Flink PMC)
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Markos Sfikas
> > > +49 (0) 15759630002
> > >
> >
>


Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-08-14 Thread jincheng sun
Hi Thomas,

Thanks for your confirmation and the very important reminder about bundle
processing.

I have added a description of how to perform bundle processing from the
perspective of checkpoints and watermarks. Feel free to leave comments if
anything is not described clearly.
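
For readers less familiar with the Beam-style execution, below is a minimal, 
self-contained sketch of the bundle idea in plain Java (not Flink or Beam API; 
the 1000-element / 1-second defaults are taken from Thomas' earlier mail). The 
key point for checkpoints and watermarks is that the current bundle must be 
flushed before the state snapshot is taken and before a watermark is forwarded, 
so that state and event-time progress only reflect fully processed bundles.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java sketch of bundle processing; not Flink or Beam API. */
public final class BundleBuffer<T> {

    private final int maxBundleSize;         // e.g. 1000 elements
    private final long maxBundleTimeMs;      // e.g. 1000 ms
    private final Consumer<List<T>> flushFn; // hands the bundle to the Python worker

    private final List<T> bundle = new ArrayList<>();
    private long bundleStartMs = -1L;

    public BundleBuffer(int maxBundleSize, long maxBundleTimeMs, Consumer<List<T>> flushFn) {
        this.maxBundleSize = maxBundleSize;
        this.maxBundleTimeMs = maxBundleTimeMs;
        this.flushFn = flushFn;
    }

    /** Adds an element; flushes when either the size or the time limit is reached. */
    public void add(T element) {
        if (bundle.isEmpty()) {
            bundleStartMs = System.currentTimeMillis();
        }
        bundle.add(element);
        if (bundle.size() >= maxBundleSize
                || System.currentTimeMillis() - bundleStartMs >= maxBundleTimeMs) {
            flush();
        }
    }

    /**
     * Must be called before snapshotting state for a checkpoint and before
     * forwarding a watermark, so that no half-processed bundle is in flight.
     */
    public void flush() {
        if (!bundle.isEmpty()) {
            flushFn.accept(new ArrayList<>(bundle));
            bundle.clear();
        }
    }
}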

Best,
Jincheng


Dian Fu  于2019年8月14日周三 上午10:08写道:

> Hi Thomas,
>
> Thanks a lot the suggestions.
>
> Regarding to bundle processing, there is a section "Checkpoint"[1] in the
> design doc which talks about how to handle the checkpoint.
> However, I think you are right that we should talk more about it, such as
> what's bundle processing, how it affects the checkpoint and watermark, how
> to handle the checkpoint and watermark, etc.
>
> [1]
> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> <
> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> >
>
> Regards,
> Dian
>
> > 在 2019年8月14日,上午1:01,Thomas Weise  写道:
> >
> > Hi Jincheng,
> >
> > Thanks for putting this together. The proposal is very detailed, thorough
> > and for me as a Beam Flink runner contributor easy to understand :)
> >
> > One thing that you should probably detail more is the bundle processing.
> It
> > is critically important for performance that multiple elements are
> > processed in a bundle. The default bundle size in the Flink runner is 1s
> or
> > 1000 elements, whichever comes first. And for streaming, you can find the
> > logic necessary to align the bundle processing with watermarks and
> > checkpointing here:
> >
> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
> >
> > Thomas
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Aug 13, 2019 at 7:05 AM jincheng sun 
> > wrote:
> >
> >> Hi all,
> >>
> >> The Python Table API(without Python UDF support) has already been
> supported
> >> and will be available in the coming release 1.9.
> >> As Python UDF is very important for Python users, we'd like to start the
> >> discussion about the Python UDF support in the Python Table API.
> >> Aljoscha Krettek, Dian Fu and I have discussed offline and have drafted
> a
> >> design doc[1]. It includes the following items:
> >>
> >> - The user-defined function interfaces.
> >> - The user-defined function execution architecture.
> >>
> >> As mentioned by many guys in the previous discussion thread[2], a
> >> portability framework was introduced in Apache Beam in latest releases.
> It
> >> provides well-defined, language-neutral data structures and protocols
> for
> >> language-neutral user-defined function execution. This design is based
> on
> >> Beam's portability framework. We will introduce how to make use of
> Beam's
> >> portability framework for user-defined function execution: data
> >> transmission, state access, checkpoint, metrics, logging, etc.
> >>
> >> Considering that the design relies on Beam's portability framework for
> >> Python user-defined function execution and not all the contributors in
> >> Flink community are familiar with Beam's portability framework, we have
> >> done a prototype[3] for proof of concept and also ease of understanding
> of
> >> the design.
> >>
> >> Welcome any feedback.
> >>
> >> Best,
> >> Jincheng
> >>
> >> [1]
> >>
> >>
> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing
> >> [2]
> >>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html
> >> [3] https://github.com/dianfu/flink/commits/udf_poc
> >>
>
>


Re: Watermarks not propagated to WebUI?

2019-08-14 Thread Thomas Weise
I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears with
higher parallelism.

This can be confusing to the user when watermarks actually work and can be
observed using the metrics.

On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský  wrote:

> Hi,
>
> is it possible, that watermarks are sometimes not propagated to WebUI,
> although they are internally moving as normal? I see in WebUI every
> operator showing "No Watermark", but outputs seem to be propagated to
> sink (and there are watermark sensitive operations involved - e.g.
> reductions on fixed windows without early emitting). More strangely,
> this happens when I increase parallelism above some threshold. If I use
> parallelism of N, watermarks are shown, when I increase it above some
> number (seems not to be exactly deterministic), watermarks seems to
> disappear.
>
> I'm using Flink 1.8.1.
>
> Did anyone experience something like this before?
>
> Jan
>
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Yu Li
Congratulations Andrey! Well deserved!

Best Regards,
Yu


On Wed, 14 Aug 2019 at 17:55, Aleksey Pak  wrote:

> Congratulations, Andrey!
>
> On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas 
> wrote:
>
> > Congrats Andrey!
> >
> > On Wed, 14 Aug 2019 at 16:47, Becket Qin  wrote:
> >
> > > Congratulations, Andrey!
> > >
> > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:
> > >
> > > > Congrats!
> > > >
> > > >
> > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger 
> > > wrote:
> > > >
> > > > > Congratulations! Very happy to have you onboard :)
> > > > >
> > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas  >
> > > > wrote:
> > > > >
> > > > > > Congratulations Andrey!
> > > > > > Well deserved!
> > > > > >
> > > > > > Kostas
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang 
> wrote:
> > > > > > >
> > > > > > > Congratulations Andrey.
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > > 
> > > > > > > From: Xintong Song 
> > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > To: Oytun Tez 
> > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > trohrm...@apache.org>; dev ; user <
> > > > > > u...@flink.apache.org>
> > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> committer
> > > > > > >
> > > > > > > Congratulations Andery~!
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez 
> > > > wrote:
> > > > > > >
> > > > > > > Congratulations Andrey!
> > > > > > >
> > > > > > > I am glad the Flink committer team is growing at such a pace!
> > > > > > >
> > > > > > > ---
> > > > > > > Oytun Tez
> > > > > > >
> > > > > > > M O T A W O R D
> > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> wander4...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > Congratulations Andrey!
> > > > > > >
> > > > > > > Best,
> > > > > > > tison.
> > > > > > >
> > > > > > >
> > > > > > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'm very happy to announce that Andrey Zagrebin accepted the
> > offer
> > > of
> > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > >
> > > > > > > Andrey has been an active community member for more than 15
> > months.
> > > > He
> > > > > > has helped shaping numerous features such as State TTL, FRocksDB
> > > > release,
> > > > > > Shuffle service abstraction, FLIP-1, result partition management
> > and
> > > > > > various fixes/improvements. He's also frequently helping out on
> the
> > > > > > user@f.a.o mailing lists.
> > > > > > >
> > > > > > > Congratulations Andrey!
> > > > > > >
> > > > > > > Best, Till
> > > > > > > (on behalf of the Flink PMC)
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Markos Sfikas
> > +49 (0) 15759630002
> >
>


Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-14 Thread Robert Metzger
Has anybody verified the inclusion of all bundled dependencies into the
NOTICE files?

I'm asking because we had some issues with that in the last release(s).

On Wed, Aug 14, 2019 at 5:31 PM Aljoscha Krettek 
wrote:

> +1
>
> I did some testing on a Google Cloud Dataproc cluster (it gives you a
> managed YARN and Google Cloud Storage (GCS)):
>   - tried both YARN session mode and YARN per-job mode, also using
> bin/flink list/cancel/etc. against a YARN session cluster
>   - ran examples that write to GCS, both with the native Hadoop FileSystem
> and a custom “plugin” FileSystem
>   - ran stateful streaming jobs that use GCS as a checkpoint backend
>   - tried running SQL programs on YARN using the SQL Cli: this worked for
> YARN session mode but not for YARN per-job mode. Looking at the code I
> don’t think per-job mode would work from seeing how it is implemented. But
> I think it’s an OK restriction to have for now
>   - in all the testing I had fine-grained recovery (region failover)
> enabled but I didn’t simulate any failures
>
> > On 14. Aug 2019, at 15:20, Kurt Young  wrote:
> >
> > Hi,
> >
> > Thanks for preparing this release candidate. I have verified the
> following:
> >
> > - verified the checksums and GPG files match the corresponding release
> files
> > - verified that the source archives do not contains any binaries
> > - build the source release with Scala 2.11 successfully.
> > - ran `mvn verify` locally, met 2 issuses [FLINK-13687] and
> [FLINK-13688],
> > but
> > both are not release blockers. Other than that, all tests are passed.
> > - ran all e2e tests which don't need download external packages (it's
> very
> > unstable
> > in China and almost impossible to download them), all passed.
> > - started local cluster, ran some examples. Met a small website display
> > issue
> > [FLINK-13591], which is also not a release blocker.
> >
> > Although we have pushed some fixes around blink planner and hive
> > integration
> > after RC2, but consider these are both preview features, I'm lean to be
> ok
> > to release
> > without these fixes.
> >
> > +1 from my side. (binding)
> >
> > Best,
> > Kurt
> >
> >
> > On Wed, Aug 14, 2019 at 5:13 PM Jark Wu  wrote:
> >
> >> Hi Gordon,
> >>
> >> I have verified the following things:
> >>
> >> - build the source release with Scala 2.12 and Scala 2.11 successfully
> >> - checked/verified signatures and hashes
> >> - checked that all POM files point to the same version
> >> - ran some flink table related end-to-end tests locally and succeeded
> >> (except TPC-H e2e failed which is reported in FLINK-13704)
> >> - started cluster for both Scala 2.11 and 2.12, ran examples, verified
> web
> >> ui and log output, nothing unexpected
> >> - started cluster, ran a SQL query to temporal join with kafka source
> and
> >> mysql jdbc table, and write results to kafka again. Using DDL to create
> the
> >> source and sinks. looks good.
> >> - reviewed the release PR
> >>
> >> As FLINK-13704 is not recognized as blocker issue, so +1 from my side
> >> (non-binding).
> >>
> >> On Tue, 13 Aug 2019 at 17:07, Till Rohrmann 
> wrote:
> >>
> >>> Hi Richard,
> >>>
> >>> although I can see that it would be handy for users who have PubSub set
> >> up,
> >>> I would rather not include examples which require an external
> dependency
> >>> into the Flink distribution. I think examples should be self-contained.
> >> My
> >>> concern is that we would bloat the distribution for many users at the
> >>> benefit of a few. Instead, I think it would be better to make these
> >>> examples available differently, maybe through Flink's ecosystem website
> >> or
> >>> maybe a new examples section in Flink's documentation.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Tue, Aug 13, 2019 at 9:43 AM Jark Wu  wrote:
> >>>
>  Hi Till,
> 
>  After thinking about we can use VARCHAR as an alternative of
>  timestamp/time/date.
>  I'm fine with not recognize it as a blocker issue.
>  We can fix it into 1.9.1.
> 
> 
>  Thanks,
>  Jark
> 
> 
>  On Tue, 13 Aug 2019 at 15:10, Richard Deurwaarder 
> >>> wrote:
> 
> > Hello all,
> >
> > I noticed the PubSub example jar is not included in the examples/ dir
> >>> of
> > flink-dist. I've created
>  https://issues.apache.org/jira/browse/FLINK-13700
> > + https://github.com/apache/flink/pull/9424/files to fix this.
> >
> > I will leave it up to you to decide if we want to add this to 1.9.0.
> >
> > Regards,
> >
> > Richard
> >
> > On Tue, Aug 13, 2019 at 9:04 AM Till Rohrmann 
> > wrote:
> >
> >> Hi Jark,
> >>
> >> thanks for reporting this issue. Could this be a documented
> >>> limitation
>  of
> >> Blink's preview version? I think we have agreed that the Blink SQL
> > planner
> >> will be rather a preview feature than production ready. Hence it
> >>> could
> >> still contain some bugs. My concern is that there 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Aleksey Pak
Congratulations, Andrey!

On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas  wrote:

> Congrats Andrey!
>
> On Wed, 14 Aug 2019 at 16:47, Becket Qin  wrote:
>
> > Congratulations, Andrey!
> >
> > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:
> >
> > > Congrats!
> > >
> > >
> > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger 
> > wrote:
> > >
> > > > Congratulations! Very happy to have you onboard :)
> > > >
> > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas 
> > > wrote:
> > > >
> > > > > Congratulations Andrey!
> > > > > Well deserved!
> > > > >
> > > > > Kostas
> > > > >
> > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
> > > > > >
> > > > > > Congratulations Andrey.
> > > > > >
> > > > > > Best
> > > > > > Yun Tang
> > > > > > 
> > > > > > From: Xintong Song 
> > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > To: Oytun Tez 
> > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > trohrm...@apache.org>; dev ; user <
> > > > > u...@flink.apache.org>
> > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> > > > > >
> > > > > > Congratulations Andery~!
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez 
> > > wrote:
> > > > > >
> > > > > > Congratulations Andrey!
> > > > > >
> > > > > > I am glad the Flink committer team is growing at such a pace!
> > > > > >
> > > > > > ---
> > > > > > Oytun Tez
> > > > > >
> > > > > > M O T A W O R D
> > > > > > The World's Fastest Human Translation Platform.
> > > > > > oy...@motaword.com — www.motaword.com
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen 
> > > > wrote:
> > > > > >
> > > > > > Congratulations Andrey!
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > >
> > > > > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'm very happy to announce that Andrey Zagrebin accepted the
> offer
> > of
> > > > > the Flink PMC to become a committer of the Flink project.
> > > > > >
> > > > > > Andrey has been an active community member for more than 15
> months.
> > > He
> > > > > has helped shaping numerous features such as State TTL, FRocksDB
> > > release,
> > > > > Shuffle service abstraction, FLIP-1, result partition management
> and
> > > > > various fixes/improvements. He's also frequently helping out on the
> > > > > user@f.a.o mailing lists.
> > > > > >
> > > > > > Congratulations Andrey!
> > > > > >
> > > > > > Best, Till
> > > > > > (on behalf of the Flink PMC)
> > > > >
> > > >
> > >
> >
>
>
> --
> Markos Sfikas
> +49 (0) 15759630002
>


Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread Aljoscha Krettek
+1 (for the same reasons I posted on the other thread)

> On 14. Aug 2019, at 15:03, Zili Chen  wrote:
> 
> +1
> 
> It could be regarded as part of Flink client api refactor.
> Removal of stale code paths helps reason refactor.
> 
> There is one thing worth attention that in this thread[1] Thomas
> suggests an interface with a method return JobGraph based on the
> fact that REST API and in per job mode actually extracts the JobGraph
> from user program and submit it instead of running user program and
> submission happens inside the program in session scenario.
> 
> Such an interface would be like
> 
> interface Program {
>  JobGraph getJobGraph(args);
> }
> 
> Anyway, the discussion above could be continued in that thread.
> Current Program is a legacy class that quite less useful than it should be.
> 
> Best,
> tison.
> 
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/REST-API-JarRunHandler-More-flexibility-for-launching-jobs-td31026.html#a31168
> 
> 
> Stephan Ewen  于2019年8月14日周三 下午7:50写道:
> 
>> +1
>> 
>> the "main" method is the overwhelming default. getting rid of "two ways to
>> do things" is a good idea.
>> 
>> On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas  wrote:
>> 
>>> Hi all,
>>> 
>>> As discussed in [1] , the Program interface seems to be outdated and
>>> there seems to be
>>> no objection to remove it.
>>> 
>>> Given that this interface is PublicEvolving, its removal should pass
>>> through a FLIP and
>>> this discussion and the associated FLIP are exactly for that purpose.
>>> 
>>> Please let me know what you think and if it is ok to proceed with its
>>> removal.
>>> 
>>> Cheers,
>>> Kostas
>>> 
>>> link to FLIP-52 :
>>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
>>> 
>>> [1]
>>> 
>> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
>>> 
>> 



Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-14 Thread Aljoscha Krettek
+1

I did some testing on a Google Cloud Dataproc cluster (it gives you a managed 
YARN and Google Cloud Storage (GCS)):
  - tried both YARN session mode and YARN per-job mode, also using bin/flink 
list/cancel/etc. against a YARN session cluster
  - ran examples that write to GCS, both with the native Hadoop FileSystem and 
a custom “plugin” FileSystem
  - ran stateful streaming jobs that use GCS as a checkpoint backend
  - tried running SQL programs on YARN using the SQL Cli: this worked for YARN 
session mode but not for YARN per-job mode. Looking at the code I don’t think 
per-job mode would work from seeing how it is implemented. But I think it’s an 
OK restriction to have for now
  - in all the testing I had fine-grained recovery (region failover) enabled 
but I didn’t simulate any failures

> On 14. Aug 2019, at 15:20, Kurt Young  wrote:
> 
> Hi,
> 
> Thanks for preparing this release candidate. I have verified the following:
> 
> - verified the checksums and GPG files match the corresponding release files
> - verified that the source archives do not contains any binaries
> - build the source release with Scala 2.11 successfully.
> - ran `mvn verify` locally, met 2 issuses [FLINK-13687] and [FLINK-13688],
> but
> both are not release blockers. Other than that, all tests are passed.
> - ran all e2e tests which don't need download external packages (it's very
> unstable
> in China and almost impossible to download them), all passed.
> - started local cluster, ran some examples. Met a small website display
> issue
> [FLINK-13591], which is also not a release blocker.
> 
> Although we have pushed some fixes around blink planner and hive
> integration
> after RC2, but consider these are both preview features, I'm lean to be ok
> to release
> without these fixes.
> 
> +1 from my side. (binding)
> 
> Best,
> Kurt
> 
> 
> On Wed, Aug 14, 2019 at 5:13 PM Jark Wu  wrote:
> 
>> Hi Gordon,
>> 
>> I have verified the following things:
>> 
>> - build the source release with Scala 2.12 and Scala 2.11 successfully
>> - checked/verified signatures and hashes
>> - checked that all POM files point to the same version
>> - ran some flink table related end-to-end tests locally and succeeded
>> (except TPC-H e2e failed which is reported in FLINK-13704)
>> - started cluster for both Scala 2.11 and 2.12, ran examples, verified web
>> ui and log output, nothing unexpected
>> - started cluster, ran a SQL query to temporal join with kafka source and
>> mysql jdbc table, and write results to kafka again. Using DDL to create the
>> source and sinks. looks good.
>> - reviewed the release PR
>> 
>> As FLINK-13704 is not recognized as blocker issue, so +1 from my side
>> (non-binding).
>> 
>> On Tue, 13 Aug 2019 at 17:07, Till Rohrmann  wrote:
>> 
>>> Hi Richard,
>>> 
>>> although I can see that it would be handy for users who have PubSub set
>> up,
>>> I would rather not include examples which require an external dependency
>>> into the Flink distribution. I think examples should be self-contained.
>> My
>>> concern is that we would bloat the distribution for many users at the
>>> benefit of a few. Instead, I think it would be better to make these
>>> examples available differently, maybe through Flink's ecosystem website
>> or
>>> maybe a new examples section in Flink's documentation.
>>> 
>>> Cheers,
>>> Till
>>> 
>>> On Tue, Aug 13, 2019 at 9:43 AM Jark Wu  wrote:
>>> 
 Hi Till,
 
 After thinking about we can use VARCHAR as an alternative of
 timestamp/time/date.
 I'm fine with not recognize it as a blocker issue.
 We can fix it into 1.9.1.
 
 
 Thanks,
 Jark
 
 
 On Tue, 13 Aug 2019 at 15:10, Richard Deurwaarder 
>>> wrote:
 
> Hello all,
> 
> I noticed the PubSub example jar is not included in the examples/ dir
>>> of
> flink-dist. I've created
 https://issues.apache.org/jira/browse/FLINK-13700
> + https://github.com/apache/flink/pull/9424/files to fix this.
> 
> I will leave it up to you to decide if we want to add this to 1.9.0.
> 
> Regards,
> 
> Richard
> 
> On Tue, Aug 13, 2019 at 9:04 AM Till Rohrmann 
> wrote:
> 
>> Hi Jark,
>> 
>> thanks for reporting this issue. Could this be a documented
>>> limitation
 of
>> Blink's preview version? I think we have agreed that the Blink SQL
> planner
>> will be rather a preview feature than production ready. Hence it
>>> could
>> still contain some bugs. My concern is that there might be still
>>> other
>> issues which we'll discover bit by bit and could postpone the
>> release
> even
>> further if we say Blink bugs are blockers.
>> 
>> Cheers,
>> Till
>> 
>> On Tue, Aug 13, 2019 at 7:42 AM Jark Wu  wrote:
>> 
>>> Hi all,
>>> 
>>> I just find an issue when testing connector DDLs against blink
 planner
>> for
>>> rc2.
>>> This issue lead to the DDL doesn't work when 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Piotr Nowojski
Congratulations! :)

> On 14 Aug 2019, at 16:52, Markos Sfikas  wrote:
> 
> Congrats Andrey!
> 
> On Wed, 14 Aug 2019 at 16:47, Becket Qin  wrote:
> 
>> Congratulations, Andrey!
>> 
>> On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:
>> 
>>> Congrats!
>>> 
>>> 
>>> On Wed, Aug 14, 2019, 7:12 AM Robert Metzger 
>> wrote:
>>> 
 Congratulations! Very happy to have you onboard :)
 
 On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas 
>>> wrote:
 
> Congratulations Andrey!
> Well deserved!
> 
> Kostas
> 
> On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
>> 
>> Congratulations Andrey.
>> 
>> Best
>> Yun Tang
>> 
>> From: Xintong Song 
>> Sent: Wednesday, August 14, 2019 21:40
>> To: Oytun Tez 
>> Cc: Zili Chen ; Till Rohrmann <
> trohrm...@apache.org>; dev ; user <
> u...@flink.apache.org>
>> Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
>> 
>> Congratulations Andery~!
>> 
>> Thank you~
>> 
>> Xintong Song
>> 
>> 
>> 
>> On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez 
>>> wrote:
>> 
>> Congratulations Andrey!
>> 
>> I am glad the Flink committer team is growing at such a pace!
>> 
>> ---
>> Oytun Tez
>> 
>> M O T A W O R D
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>> 
>> 
>> On Wed, Aug 14, 2019 at 9:29 AM Zili Chen 
 wrote:
>> 
>> Congratulations Andrey!
>> 
>> Best,
>> tison.
>> 
>> 
>> Till Rohrmann  于2019年8月14日周三 下午9:26写道:
>> 
>> Hi everyone,
>> 
>> I'm very happy to announce that Andrey Zagrebin accepted the offer
>> of
> the Flink PMC to become a committer of the Flink project.
>> 
>> Andrey has been an active community member for more than 15 months.
>>> He
> has helped shaping numerous features such as State TTL, FRocksDB
>>> release,
> Shuffle service abstraction, FLIP-1, result partition management and
> various fixes/improvements. He's also frequently helping out on the
> user@f.a.o mailing lists.
>> 
>> Congratulations Andrey!
>> 
>> Best, Till
>> (on behalf of the Flink PMC)
> 
 
>>> 
>> 
> 
> 
> -- 
> Markos Sfikas
> +49 (0) 15759630002



[jira] [Created] (FLINK-13721) BroadcastState should support StateTTL

2019-08-14 Thread JIRA
Kerem Ulutaş created FLINK-13721:


 Summary: BroadcastState should support StateTTL
 Key: FLINK-13721
 URL: https://issues.apache.org/jira/browse/FLINK-13721
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Runtime / Queryable State
Affects Versions: 1.8.1
 Environment: MacOS 10.14.6 running IntelliJ Idea Ultimate 2019.2, 
Flink version 1.8.1
Reporter: Kerem Ulutaş
 Attachments: DebugBroadcastStateTTL.java, IntHolder.java, 
StringHolder.java, flink_broadcast_state_ttl_debug.log

Hi everyone,

Sorry if I'm doing anything wrong, this is my first issue in Apache Jira.

I have a use case which requires 2 data streams to be cross-joined. To be 
exact, one stream is location updates from clients and the other stream is 
event data with location information. I'm trying to get events that happen 
within a certain radius of the client location(s).

Since the streams cannot be related to each other by a common key for 
partitioning, I have to broadcast client updates to all tasks and evaluate the 
radius check for each event.

The requirement here is, if we don't receive any location updates from a client 
for a certain amount of time, we should consider the client is "gone" (similar 
to the rationale stated in FLINK-3089 description: 
https://issues.apache.org/jira/browse/FLINK-3089)

I have attached the sample application classes I used to debug BroadcastState 
and StateTTL together.

The output (see flink_broadcast_state_ttl_debug.log) shows that client "c0" got 
its first event at 17:08:07.67 (expected to be evicted sometime after 
17:08:37.xxx) but doesn't get evicted.

For the operator part (which is the result of 
BroadcastConnectedStream.process): since the context in the onTimer method 
doesn't let the user change the contents of the broadcast state, the only way to 
deal with stale client data is as follows (see the sketch further below):
 * In the processElement method, only compute results for client data that is 
newer than the TTL
 * In the processBroadcastElement method, remove client data older than a 
certain amount of time from the broadcast state

If the broadcast side of the connected streams doesn't get data for longer than 
the desired time-to-live, the BroadcastState will hold stale data and the 
processElement method has to filter out that stale client data each time. This 
is the approach I am using in production right now.
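
As an illustration of that workaround, here is a minimal sketch of a manual TTL 
inside a KeyedBroadcastProcessFunction. It is simplified compared to the 
attached sample classes: String stands in for both stream element types, the 
broadcast state only maps clientId -> last update timestamp, and the radius 
check is omitted, so please treat it as a sketch rather than the exact 
production code.

import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ManualTtlBroadcastFunction
        extends KeyedBroadcastProcessFunction<String, String, String, String> {

    private static final long TTL_MS = 30_000L;

    static final MapStateDescriptor<String, Long> CLIENTS =
            new MapStateDescriptor<>("clients", Types.STRING, Types.LONG);

    @Override
    public void processElement(String event, ReadOnlyContext ctx, Collector<String> out)
            throws Exception {
        ReadOnlyBroadcastState<String, Long> clients = ctx.getBroadcastState(CLIENTS);
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> entry : clients.immutableEntries()) {
            // Skip clients whose last update is older than the TTL ("gone" clients).
            if (now - entry.getValue() <= TTL_MS) {
                out.collect(entry.getKey() + " is alive for event " + event);
            }
        }
    }

    @Override
    public void processBroadcastElement(String clientId, Context ctx, Collector<String> out)
            throws Exception {
        BroadcastState<String, Long> clients = ctx.getBroadcastState(CLIENTS);
        long now = System.currentTimeMillis();
        clients.put(clientId, now);

        // Evict stale clients here, since processElement and onTimer cannot
        // modify the broadcast state.
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : clients.entries()) {
            if (now - entry.getValue() > TTL_MS) {
                stale.add(entry.getKey());
            }
        }
        for (String key : stale) {
            clients.remove(key);
        }
    }
}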

I am not aware of any decision or limitation that makes it impossible to 
implement StateTTL for BroadcastState; I would be pleased if someone could 
explain whether there is one.

Thanks and regards.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Checkpointing under backpressure

2019-08-14 Thread Piotr Nowojski
> Thanks for the great ideas so far.

+1

Regarding other things raised, I mostly agree with Stephan. 

I like the idea of simultaneously starting the checkpoint everywhere via RPC 
call (especially in cases where Tasks are busy doing some synchronous 
operations for example for tens of milliseconds. In that case every network 
exchange adds tens of milliseconds of delay in propagating the checkpoint). 
However I agree that this might be a premature optimisation assuming the 
current state of our code (we already have checkpoint barriers).

However I like the idea of switching from:

1. A -> S -> F (Align -> snapshot -> forward markers)

To

2. S -> F -> L (Snapshot -> forward markers -> log pending channels)

Or even to

6. F -> S -> L (Forward markers -> snapshot -> log pending channels)

It feels to me like this would decouple the propagation of checkpoints from the 
cost of synchronous snapshots and from waiting for all of the checkpoint 
barriers to arrive (even if they overtake in-flight records, this might take 
some time).
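
For reference, the three orderings can be read as the following informal 
pseudocode; the method names are just placeholders for the phases, not Flink 
APIs (A = align inputs, S = synchronous state snapshot, F = forward barrier, 
L = log in-flight data from channels whose barrier is still pending):

// Informal sketch of the phase orderings discussed above; placeholder methods only.
class CheckpointPhaseOrderings {

    void alignedCheckpoint(long id) {  // 1. A -> S -> F
        alignInputs(id);
        snapshotState(id);
        forwardBarrier(id);
    }

    void snapshotFirst(long id) {      // 2. S -> F -> L
        snapshotState(id);
        forwardBarrier(id);
        logPendingChannels(id);
    }

    void forwardFirst(long id) {       // 6. F -> S -> L
        forwardBarrier(id);
        snapshotState(id);
        logPendingChannels(id);
    }

    // No-op stubs standing in for the real phases.
    void alignInputs(long id) {}
    void snapshotState(long id) {}
    void forwardBarrier(long id) {}
    void logPendingChannels(long id) {}
}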

> What I like about the Chandy Lamport approach (2.) initiated from sources is 
> that:
>   - Snapshotting imposes no modification to normal processing. 

Yes, I agree that would be nice. Currently, during the alignment and blocking 
of the input channels, we might be wasting CPU cycles of upstream tasks. If we 
succeed in designing a new checkpointing mechanism that does not disrupt/block 
regular data processing (modulo the extra IO cost for logging the in-flight 
records), that would be a huge improvement.

Piotrek

> On 14 Aug 2019, at 14:56, Paris Carbone  wrote:
> 
> Sure I see. In cases when no periodic aligned snapshots are employed this is 
> the only option.
> 
> Two things that were not highlighted enough so far on the proposed protocol 
> (included my mails):
>   - The Recovery/Reconfiguration strategy should strictly prioritise 
> processing logged events before entering normal task input operation. 
> Otherwise causality can be violated. This also means dataflow recovery will 
> be expected to be slower to the one employed on an aligned snapshot.
>   - Same as with state capture, markers should be forwarded upon first 
> marker received on input. No later than that. Otherwise we have duplicate 
> side effects.
> 
> Thanks for the great ideas so far.
> 
> Paris
> 
>> On 14 Aug 2019, at 14:33, Stephan Ewen  wrote:
>> 
>> Scaling with unaligned checkpoints might be a necessity.
>> 
>> Let's assume the job failed due to a lost TaskManager, but no new
>> TaskManager becomes available.
>> In that case we need to scale down based on the latest complete checkpoint,
>> because we cannot produce a new checkpoint.
>> 
>> 
>> On Wed, Aug 14, 2019 at 2:05 PM Paris Carbone 
>> wrote:
>> 
>>> +1 I think we are on the same page Stephan.
>>> 
>>> Rescaling on unaligned checkpoint sounds challenging and a bit
>>> unnecessary. No?
>>> Why not sticking to aligned snapshots for live reconfiguration/rescaling?
>>> It’s a pretty rare operation and it would simplify things by a lot.
>>> Everything can be “staged” upon alignment including replacing channels and
>>> tasks.
>>> 
>>> -Paris
>>> 
 On 14 Aug 2019, at 13:39, Stephan Ewen  wrote:
 
 Hi all!
 
 Yes, the first proposal of "unaligend checkpoints" (probably two years
>>> back
 now) drew a major inspiration from Chandy Lamport, as did actually the
 original checkpointing algorithm.
 
 "Logging data between first and last barrier" versus "barrier jumping
>>> over
 buffer and storing those buffers" is pretty close same.
 However, there are a few nice benefits of the proposal of unaligned
 checkpoints over Chandy-Lamport.
 
 *## Benefits of Unaligned Checkpoints*
 
 (1) It is very similar to the original algorithm (can be seen an an
 optional feature purely in the network stack) and thus can share lot's of
 code paths.
 
 (2) Less data stored. If we make the "jump over buffers" part timeout
>>> based
 (for example barrier overtakes buffers if not flushed within 10ms) then
 checkpoints are in the common case of flowing pipelines aligned without
 in-flight data. Only back pressured cases store some in-flight data,
>>> which
 means we don't regress in the common case and only fix the back pressure
 case.
 
 (3) Faster checkpoints. Chandy Lamport still waits for all barriers to
 arrive naturally, logging on the way. If data processing is slow, this
>>> can
 still take quite a while.
 
 ==> I think both these points are strong reasons to not change the
 mechanism away from "trigger sources" and start with CL-style "trigger
>>> all".
 
 
 *## Possible ways to combine Chandy Lamport and Unaligned Checkpoints*
 
 We can think about something like "take state snapshot on first barrier"
 and then store buffers until the other barriers arrive. Inside the
>>> network
 stack, barriers could still 

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Markos Sfikas
Congrats Andrey!

On Wed, 14 Aug 2019 at 16:47, Becket Qin  wrote:

> Congratulations, Andrey!
>
> On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:
>
> > Congrats!
> >
> >
> > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger 
> wrote:
> >
> > > Congratulations! Very happy to have you onboard :)
> > >
> > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas 
> > wrote:
> > >
> > > > Congratulations Andrey!
> > > > Well deserved!
> > > >
> > > > Kostas
> > > >
> > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
> > > > >
> > > > > Congratulations Andrey.
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Xintong Song 
> > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > To: Oytun Tez 
> > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > trohrm...@apache.org>; dev ; user <
> > > > u...@flink.apache.org>
> > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> > > > >
> > > > > Congratulations Andery~!
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez 
> > wrote:
> > > > >
> > > > > Congratulations Andrey!
> > > > >
> > > > > I am glad the Flink committer team is growing at such a pace!
> > > > >
> > > > > ---
> > > > > Oytun Tez
> > > > >
> > > > > M O T A W O R D
> > > > > The World's Fastest Human Translation Platform.
> > > > > oy...@motaword.com — www.motaword.com
> > > > >
> > > > >
> > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen 
> > > wrote:
> > > > >
> > > > > Congratulations Andrey!
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'm very happy to announce that Andrey Zagrebin accepted the offer
> of
> > > > the Flink PMC to become a committer of the Flink project.
> > > > >
> > > > > Andrey has been an active community member for more than 15 months.
> > He
> > > > has helped shaping numerous features such as State TTL, FRocksDB
> > release,
> > > > Shuffle service abstraction, FLIP-1, result partition management and
> > > > various fixes/improvements. He's also frequently helping out on the
> > > > user@f.a.o mailing lists.
> > > > >
> > > > > Congratulations Andrey!
> > > > >
> > > > > Best, Till
> > > > > (on behalf of the Flink PMC)
> > > >
> > >
> >
>


-- 
Markos Sfikas
+49 (0) 15759630002


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Becket Qin
Congratulations, Andrey!

On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise  wrote:

> Congrats!
>
>
> On Wed, Aug 14, 2019, 7:12 AM Robert Metzger  wrote:
>
> > Congratulations! Very happy to have you onboard :)
> >
> > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas 
> wrote:
> >
> > > Congratulations Andrey!
> > > Well deserved!
> > >
> > > Kostas
> > >
> > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
> > > >
> > > > Congratulations Andrey.
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Xintong Song 
> > > > Sent: Wednesday, August 14, 2019 21:40
> > > > To: Oytun Tez 
> > > > Cc: Zili Chen ; Till Rohrmann <
> > > trohrm...@apache.org>; dev ; user <
> > > u...@flink.apache.org>
> > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> > > >
> > > > Congratulations Andery~!
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez 
> wrote:
> > > >
> > > > Congratulations Andrey!
> > > >
> > > > I am glad the Flink committer team is growing at such a pace!
> > > >
> > > > ---
> > > > Oytun Tez
> > > >
> > > > M O T A W O R D
> > > > The World's Fastest Human Translation Platform.
> > > > oy...@motaword.com — www.motaword.com
> > > >
> > > >
> > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen 
> > wrote:
> > > >
> > > > Congratulations Andrey!
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > > >
> > > > Hi everyone,
> > > >
> > > > I'm very happy to announce that Andrey Zagrebin accepted the offer of
> > > the Flink PMC to become a committer of the Flink project.
> > > >
> > > > Andrey has been an active community member for more than 15 months.
> He
> > > has helped shaping numerous features such as State TTL, FRocksDB
> release,
> > > Shuffle service abstraction, FLIP-1, result partition management and
> > > various fixes/improvements. He's also frequently helping out on the
> > > user@f.a.o mailing lists.
> > > >
> > > > Congratulations Andrey!
> > > >
> > > > Best, Till
> > > > (on behalf of the Flink PMC)
> > >
> >
>


Watermarks not propagated to WebUI?

2019-08-14 Thread Jan Lukavský

Hi,

is it possible that watermarks are sometimes not propagated to the WebUI, 
although they are internally advancing as normal? I see every operator in 
the WebUI showing "No Watermark", but outputs seem to be propagated to the 
sink (and there are watermark-sensitive operations involved - e.g. 
reductions on fixed windows without early emitting). More strangely, 
this happens when I increase parallelism above some threshold. If I use a 
parallelism of N, watermarks are shown; when I increase it above some 
number (it does not seem to be exactly deterministic), the watermarks seem 
to disappear.


I'm using Flink 1.8.1.

Did anyone experience something like this before?

Jan
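
For reference, below is a minimal sketch of the kind of watermark-sensitive
pipeline described above (periodic watermarks feeding a fixed event-time
window without early firing), using the Flink 1.8 DataStream API. The input
type, values and class name are made up purely for illustration:

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeWindowSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Hypothetical input: (key, eventTimestampMillis) pairs.
        DataStream<Tuple2<String, Long>> events = env.fromElements(
                Tuple2.of("a", 1_000L), Tuple2.of("a", 61_000L), Tuple2.of("b", 2_000L));

        events
            // Periodic watermarks with 5s allowed out-of-orderness; the watermark
            // shown in the WebUI is derived from the watermarks flowing through operators.
            .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(Tuple2<String, Long> element) {
                        return element.f1;
                    }
                })
            .keyBy(new KeySelector<Tuple2<String, Long>, String>() {
                @Override
                public String getKey(Tuple2<String, Long> value) {
                    return value.f0;
                }
            })
            // Fixed (tumbling) event-time window, no early firing: results are only
            // emitted once the watermark passes the end of the window.
            .window(TumblingEventTimeWindows.of(Time.minutes(1)))
            .reduce(new ReduceFunction<Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
                    return Tuple2.of(a.f0, Math.max(a.f1, b.f1));
                }
            })
            .print();

        env.execute("event-time window sketch");
    }
}

Note that a finite fromElements source emits a final Long.MAX_VALUE watermark
when it finishes, so reproducing the "No Watermark" display behaviour would
need a continuous source such as Kafka.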



Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Thomas Weise
Congrats!


On Wed, Aug 14, 2019, 7:12 AM Robert Metzger  wrote:

> Congratulations! Very happy to have you onboard :)
>
> On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas  wrote:
>
> > Congratulations Andrey!
> > Well deserved!
> >
> > Kostas
> >
> > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
> > >
> > > Congratulations Andrey.
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Xintong Song 
> > > Sent: Wednesday, August 14, 2019 21:40
> > > To: Oytun Tez 
> > > Cc: Zili Chen ; Till Rohrmann <
> > trohrm...@apache.org>; dev ; user <
> > u...@flink.apache.org>
> > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> > >
> > > Congratulations Andery~!
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez  wrote:
> > >
> > > Congratulations Andrey!
> > >
> > > I am glad the Flink committer team is growing at such a pace!
> > >
> > > ---
> > > Oytun Tez
> > >
> > > M O T A W O R D
> > > The World's Fastest Human Translation Platform.
> > > oy...@motaword.com — www.motaword.com
> > >
> > >
> > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen 
> wrote:
> > >
> > > Congratulations Andrey!
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> > >
> > > Hi everyone,
> > >
> > > I'm very happy to announce that Andrey Zagrebin accepted the offer of
> > the Flink PMC to become a committer of the Flink project.
> > >
> > > Andrey has been an active community member for more than 15 months. He
> > has helped shaping numerous features such as State TTL, FRocksDB release,
> > Shuffle service abstraction, FLIP-1, result partition management and
> > various fixes/improvements. He's also frequently helping out on the
> > user@f.a.o mailing lists.
> > >
> > > Congratulations Andrey!
> > >
> > > Best, Till
> > > (on behalf of the Flink PMC)
> >
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Robert Metzger
Congratulations! Very happy to have you onboard :)

On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas  wrote:

> Congratulations Andrey!
> Well deserved!
>
> Kostas
>
> On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
> >
> > Congratulations Andrey.
> >
> > Best
> > Yun Tang
> > 
> > From: Xintong Song 
> > Sent: Wednesday, August 14, 2019 21:40
> > To: Oytun Tez 
> > Cc: Zili Chen ; Till Rohrmann <
> trohrm...@apache.org>; dev ; user <
> u...@flink.apache.org>
> > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
> >
> > Congratulations Andery~!
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez  wrote:
> >
> > Congratulations Andrey!
> >
> > I am glad the Flink committer team is growing at such a pace!
> >
> > ---
> > Oytun Tez
> >
> > M O T A W O R D
> > The World's Fastest Human Translation Platform.
> > oy...@motaword.com — www.motaword.com
> >
> >
> > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen  wrote:
> >
> > Congratulations Andrey!
> >
> > Best,
> > tison.
> >
> >
> > Till Rohrmann  于2019年8月14日周三 下午9:26写道:
> >
> > Hi everyone,
> >
> > I'm very happy to announce that Andrey Zagrebin accepted the offer of
> the Flink PMC to become a committer of the Flink project.
> >
> > Andrey has been an active community member for more than 15 months. He
> has helped shaping numerous features such as State TTL, FRocksDB release,
> Shuffle service abstraction, FLIP-1, result partition management and
> various fixes/improvements. He's also frequently helping out on the
> user@f.a.o mailing lists.
> >
> > Congratulations Andrey!
> >
> > Best, Till
> > (on behalf of the Flink PMC)
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Oytun Tez
Congratulations Andrey!

I am glad the Flink committer team is growing at such a pace!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Aug 14, 2019 at 9:29 AM Zili Chen  wrote:

> Congratulations Andrey!
>
> Best,
> tison.
>
>
> Till Rohrmann  于2019年8月14日周三 下午9:26写道:
>
>> Hi everyone,
>>
>> I'm very happy to announce that Andrey Zagrebin accepted the offer of
>> the Flink PMC to become a committer of the Flink project.
>>
>> Andrey has been an active community member for more than 15 months. He
>> has helped shaping numerous features such as State TTL, FRocksDB release,
>> Shuffle service abstraction, FLIP-1, result partition management and
>> various fixes/improvements. He's also frequently helping out on the
>> user@f.a.o mailing lists.
>>
>> Congratulations Andrey!
>>
>> Best, Till
>> (on behalf of the Flink PMC)
>>
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Kostas Kloudas
Congratulations Andrey!
Well deserved!

Kostas

On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  wrote:
>
> Congratulations Andrey.
>
> Best
> Yun Tang
> 
> From: Xintong Song 
> Sent: Wednesday, August 14, 2019 21:40
> To: Oytun Tez 
> Cc: Zili Chen ; Till Rohrmann ; 
> dev ; user 
> Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer
>
> Congratulations Andery~!
>
> Thank you~
>
> Xintong Song
>
>
>
> On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez  wrote:
>
> Congratulations Andrey!
>
> I am glad the Flink committer team is growing at such a pace!
>
> ---
> Oytun Tez
>
> M O T A W O R D
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Wed, Aug 14, 2019 at 9:29 AM Zili Chen  wrote:
>
> Congratulations Andrey!
>
> Best,
> tison.
>
>
> Till Rohrmann  于2019年8月14日周三 下午9:26写道:
>
> Hi everyone,
>
> I'm very happy to announce that Andrey Zagrebin accepted the offer of the 
> Flink PMC to become a committer of the Flink project.
>
> Andrey has been an active community member for more than 15 months. He has 
> helped shaping numerous features such as State TTL, FRocksDB release, Shuffle 
> service abstraction, FLIP-1, result partition management and various 
> fixes/improvements. He's also frequently helping out on the user@f.a.o 
> mailing lists.
>
> Congratulations Andrey!
>
> Best, Till
> (on behalf of the Flink PMC)


[jira] [Created] (FLINK-13720) Include asm-commons in flink-shaded-asm

2019-08-14 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-13720:
---

 Summary: Include asm-commons in flink-shaded-asm
 Key: FLINK-13720
 URL: https://issues.apache.org/jira/browse/FLINK-13720
 Project: Flink
  Issue Type: Wish
  Components: BuildSystem / Shaded
Reporter: Tzu-Li (Gordon) Tai


{{asm-commons}} was included before in flink-shaded version 6.0, but was later 
removed in 7.0 when we upgraded to asm 6.2.1.

We bumped into a use case in tests which requires {{ClassRelocator}}, which is 
part of {{asm-commons}}.
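
For context, the relocation machinery in asm-commons boils down to the
ClassRemapper/SimpleRemapper pair sketched below; the class and method names
in the sketch are illustrative only and are not the actual ClassRelocator API:

{code}
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.commons.ClassRemapper;
import org.objectweb.asm.commons.SimpleRemapper;

// Rewrites all references to one internal class name into another and
// returns the patched bytecode. ClassRemapper and SimpleRemapper live in
// asm-commons, which is why the artifact needs to be on the shaded classpath.
public final class RelocationSketch {

    public static byte[] relocate(byte[] originalClass, String fromInternalName, String toInternalName) {
        ClassReader reader = new ClassReader(originalClass);
        ClassWriter writer = new ClassWriter(0);
        reader.accept(new ClassRemapper(writer, new SimpleRemapper(fromInternalName, toInternalName)), 0);
        return writer.toByteArray();
    }

    private RelocationSketch() {}
}
{code}

Internal names use slashes, e.g. relocate(bytes, "org/foo/Bar", "org/foo/shaded/Bar").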



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Xintong Song
Congratulations Andrey~!

Thank you~

Xintong Song



On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez  wrote:

> Congratulations Andrey!
>
> I am glad the Flink committer team is growing at such a pace!
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Wed, Aug 14, 2019 at 9:29 AM Zili Chen  wrote:
>
>> Congratulations Andrey!
>>
>> Best,
>> tison.
>>
>>
>> Till Rohrmann  于2019年8月14日周三 下午9:26写道:
>>
>>> Hi everyone,
>>>
>>> I'm very happy to announce that Andrey Zagrebin accepted the offer of
>>> the Flink PMC to become a committer of the Flink project.
>>>
>>> Andrey has been an active community member for more than 15 months. He
>>> has helped shaping numerous features such as State TTL, FRocksDB release,
>>> Shuffle service abstraction, FLIP-1, result partition management and
>>> various fixes/improvements. He's also frequently helping out on the
>>> user@f.a.o mailing lists.
>>>
>>> Congratulations Andrey!
>>>
>>> Best, Till
>>> (on behalf of the Flink PMC)
>>>
>>


[jira] [Created] (FLINK-13719) Update Yarn E2E test docker image to run on Java 11

2019-08-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-13719:


 Summary: Update Yarn E2E test docker image to run on Java 11
 Key: FLINK-13719
 URL: https://issues.apache.org/jira/browse/FLINK-13719
 Project: Flink
  Issue Type: Sub-task
  Components: Tests
Reporter: Chesnay Schepler
 Fix For: 1.10.0


The Docker image used in the e2e tests is using Java 8. We have to set up 
another build that uses Java 11.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13718) Disable HBase tests

2019-08-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-13718:


 Summary: Disable HBase tests
 Key: FLINK-13718
 URL: https://issues.apache.org/jira/browse/FLINK-13718
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / HBase, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.10.0


The HBase tests are categorically failing on Java 11. Given that HBase itself 
does not support Java 11 at this point we should just disable these tests for 
the time being.

{code}
HBaseConnectorITCase.activateHBaseCluster:81->HBaseTestingClusterAutostarter.registerHBaseMiniClusterInClasspath:189->HBaseTestingClusterAutostarter.addDirectoryToClassPath:224
 We should get a URLClassLoader
  
HBaseLookupFunctionITCase.activateHBaseCluster:95->HBaseTestingClusterAutostarter.registerHBaseMiniClusterInClasspath:189->HBaseTestingClusterAutostarter.addDirectoryToClassPath:224
 We should get a URLClassLoader
  
HBaseSinkITCase.activateHBaseCluster:91->HBaseTestingClusterAutostarter.registerHBaseMiniClusterInClasspath:189->HBaseTestingClusterAutostarter.addDirectoryToClassPath:224
 We should get a URLClassLoader
{code}
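
The failures come from the mini-cluster setup assuming that the system class
loader is a URLClassLoader, which is no longer true on Java 9+. One possible
way to skip (rather than fail) such tests on newer Java versions is sketched
below; this is only an illustration, not necessarily how the Flink build will
disable them:

{code}
import static org.junit.Assume.assumeTrue;

import org.junit.BeforeClass;

// Base class that the HBase ITCases could extend: the whole class is skipped
// (not failed) when the JVM is not Java 8, because the mini-cluster setup
// casts the system class loader to URLClassLoader.
public abstract class Java8OnlyTestBase {

    @BeforeClass
    public static void assumeJava8() {
        // Java 8 reports versions like "1.8.0_212"; Java 9+ reports "9", "11.0.2", ...
        assumeTrue(
            "HBase mini-cluster tests need a URLClassLoader and therefore Java 8",
            System.getProperty("java.version").startsWith("1."));
    }
}
{code}

Disabling the modules via a Maven profile keyed off the Java version would
achieve the same effect at the build level.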



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Zili Chen
Congratulations Andrey!

Best,
tison.


Till Rohrmann  于2019年8月14日周三 下午9:26写道:

> Hi everyone,
>
> I'm very happy to announce that Andrey Zagrebin accepted the offer of the
> Flink PMC to become a committer of the Flink project.
>
> Andrey has been an active community member for more than 15 months. He
> has helped shaping numerous features such as State TTL, FRocksDB release,
> Shuffle service abstraction, FLIP-1, result partition management and
> various fixes/improvements. He's also frequently helping out on the
> user@f.a.o mailing lists.
>
> Congratulations Andrey!
>
> Best, Till
> (on behalf of the Flink PMC)
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread zhijiang
Congratulations Andrey, great work and well deserved!

Best,
Zhijiang
--
From:Till Rohrmann 
Send Time:2019年8月14日(星期三) 15:26
To:dev ; user 
Subject:[ANNOUNCE] Andrey Zagrebin becomes a Flink committer

Hi everyone,

I'm very happy to announce that Andrey Zagrebin accepted the offer of the Flink 
PMC to become a committer of the Flink project.

Andrey has been an active community member for more than 15 months. He has 
helped shaping numerous features such as State TTL, FRocksDB release, Shuffle 
service abstraction, FLIP-1, result partition management and various 
fixes/improvements. He's also frequently helping out on the user@f.a.o mailing 
lists.

Congratulations Andrey!

Best, Till 
(on behalf of the Flink PMC)



[ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Till Rohrmann
Hi everyone,

I'm very happy to announce that Andrey Zagrebin accepted the offer of the
Flink PMC to become a committer of the Flink project.

Andrey has been an active community member for more than 15 months. He has
helped shaping numerous features such as State TTL, FRocksDB release,
Shuffle service abstraction, FLIP-1, result partition management and
various fixes/improvements. He's also frequently helping out on the
user@f.a.o mailing lists.

Congratulations Andrey!

Best, Till
(on behalf of the Flink PMC)


Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-14 Thread Kurt Young
Hi,

Thanks for preparing this release candidate. I have verified the following:

- verified that the checksums and GPG files match the corresponding release files
- verified that the source archives do not contain any binaries
- built the source release with Scala 2.11 successfully
- ran `mvn verify` locally; met 2 issues [FLINK-13687] and [FLINK-13688],
but both are not release blockers. Other than that, all tests passed.
- ran all e2e tests which don't need to download external packages (the
download is very unstable in China and almost impossible), all passed
- started a local cluster and ran some examples; met a small website display
issue [FLINK-13591], which is also not a release blocker

Although we have pushed some fixes around the Blink planner and Hive
integration after RC2, considering that these are both preview features,
I'm OK with releasing without them.

+1 from my side. (binding)

Best,
Kurt


On Wed, Aug 14, 2019 at 5:13 PM Jark Wu  wrote:

> Hi Gordon,
>
> I have verified the following things:
>
> - build the source release with Scala 2.12 and Scala 2.11 successfully
> - checked/verified signatures and hashes
> - checked that all POM files point to the same version
> - ran some flink table related end-to-end tests locally and succeeded
> (except TPC-H e2e failed which is reported in FLINK-13704)
> - started cluster for both Scala 2.11 and 2.12, ran examples, verified web
> ui and log output, nothing unexpected
> - started cluster, ran a SQL query to temporal join with kafka source and
> mysql jdbc table, and write results to kafka again. Using DDL to create the
> source and sinks. looks good.
> - reviewed the release PR
>
> As FLINK-13704 is not recognized as blocker issue, so +1 from my side
> (non-binding).
>
> On Tue, 13 Aug 2019 at 17:07, Till Rohrmann  wrote:
>
> > Hi Richard,
> >
> > although I can see that it would be handy for users who have PubSub set
> up,
> > I would rather not include examples which require an external dependency
> > into the Flink distribution. I think examples should be self-contained.
> My
> > concern is that we would bloat the distribution for many users at the
> > benefit of a few. Instead, I think it would be better to make these
> > examples available differently, maybe through Flink's ecosystem website
> or
> > maybe a new examples section in Flink's documentation.
> >
> > Cheers,
> > Till
> >
> > On Tue, Aug 13, 2019 at 9:43 AM Jark Wu  wrote:
> >
> > > Hi Till,
> > >
> > > After thinking about we can use VARCHAR as an alternative of
> > > timestamp/time/date.
> > > I'm fine with not recognize it as a blocker issue.
> > > We can fix it into 1.9.1.
> > >
> > >
> > > Thanks,
> > > Jark
> > >
> > >
> > > On Tue, 13 Aug 2019 at 15:10, Richard Deurwaarder 
> > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I noticed the PubSub example jar is not included in the examples/ dir
> > of
> > > > flink-dist. I've created
> > > https://issues.apache.org/jira/browse/FLINK-13700
> > > >  + https://github.com/apache/flink/pull/9424/files to fix this.
> > > >
> > > > I will leave it up to you to decide if we want to add this to 1.9.0.
> > > >
> > > > Regards,
> > > >
> > > > Richard
> > > >
> > > > On Tue, Aug 13, 2019 at 9:04 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > thanks for reporting this issue. Could this be a documented
> > limitation
> > > of
> > > > > Blink's preview version? I think we have agreed that the Blink SQL
> > > > planner
> > > > > will be rather a preview feature than production ready. Hence it
> > could
> > > > > still contain some bugs. My concern is that there might be still
> > other
> > > > > issues which we'll discover bit by bit and could postpone the
> release
> > > > even
> > > > > further if we say Blink bugs are blockers.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Aug 13, 2019 at 7:42 AM Jark Wu  wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I just find an issue when testing connector DDLs against blink
> > > planner
> > > > > for
> > > > > > rc2.
> > > > > > This issue lead to the DDL doesn't work when containing
> > > > > timestamp/date/time
> > > > > > type.
> > > > > > I have created an issue FLINK-13699[1] and a pull request for
> this.
> > > > > >
> > > > > > IMO, this can be a blocker issue of 1.9 release. Because
> > > > > > timestamp/date/time are primitive types, and this will break the
> > DDL
> > > > > > feature.
> > > > > > However, I want to hear more thoughts from the community whether
> we
> > > > > should
> > > > > > recognize it as a blocker.
> > > > > >
> > > > > > Thanks,
> > > > > > Jark
> > > > > >
> > > > > >
> > > > > > [1]: https://issues.apache.org/jira/browse/FLINK-13699
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, 12 Aug 2019 at 22:46, Becket Qin 
> > > wrote:
> > > > > >
> > > > > > > Thanks Gordon, will do that.
> > > > > > >
> > > > > > > On Mon, Aug 12, 2019 at 4:42 PM Tzu-Li (Gordon) Tai <
> > > > > 

Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread Zili Chen
+1

It could be regarded as part of the Flink client API refactoring.
Removing stale code paths makes the refactoring easier to reason about.

One thing worth attention: in this thread [1] Thomas suggests an
interface with a method returning a JobGraph, based on the fact that
the REST API and the per-job mode actually extract the JobGraph from
the user program and submit it, whereas in the session scenario the
user program is run and submission happens inside the program.

Such an interface would look like

interface Program {
  JobGraph getJobGraph(String... args);
}

Anyway, the discussion above could be continued in that thread.
The current Program is a legacy class that is much less useful than it
should be.

Best,
tison.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/REST-API-JarRunHandler-More-flexibility-for-launching-jobs-td31026.html#a31168
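
For readers less familiar with the two entry-point styles: the main-method
style that this thread treats as the default looks roughly like the sketch
below; the job itself is hypothetical and only for illustration. As far as I
understand, the legacy Program interface instead makes the client call back
into the user program to obtain the plan, which is exactly the extra code
path being removed.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Plain main-method entry point: the client simply runs main(), and the job
// is submitted when execute() is called. No special interface is needed.
public class WordLengthJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("to", "be", "or", "not", "to", "be")
           .map(new MapFunction<String, Integer>() {
               @Override
               public Integer map(String value) {
                   return value.length();
               }
           })
           .print();

        env.execute("word-length-job");
    }
}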


Stephan Ewen  于2019年8月14日周三 下午7:50写道:

> +1
>
> the "main" method is the overwhelming default. getting rid of "two ways to
> do things" is a good idea.
>
> On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas  wrote:
>
> > Hi all,
> >
> > As discussed in [1] , the Program interface seems to be outdated and
> > there seems to be
> > no objection to remove it.
> >
> > Given that this interface is PublicEvolving, its removal should pass
> > through a FLIP and
> > this discussion and the associated FLIP are exactly for that purpose.
> >
> > Please let me know what you think and if it is ok to proceed with its
> > removal.
> >
> > Cheers,
> > Kostas
> >
> > link to FLIP-52 :
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
> >
> > [1]
> >
> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
> >
>


Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread Till Rohrmann
+1

Cheers,
Till

On Wed, Aug 14, 2019 at 1:50 PM Stephan Ewen  wrote:

> +1
>
> the "main" method is the overwhelming default. getting rid of "two ways to
> do things" is a good idea.
>
> On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas  wrote:
>
> > Hi all,
> >
> > As discussed in [1] , the Program interface seems to be outdated and
> > there seems to be
> > no objection to remove it.
> >
> > Given that this interface is PublicEvolving, its removal should pass
> > through a FLIP and
> > this discussion and the associated FLIP are exactly for that purpose.
> >
> > Please let me know what you think and if it is ok to proceed with its
> > removal.
> >
> > Cheers,
> > Kostas
> >
> > link to FLIP-52 :
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
> >
> > [1]
> >
> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
> >
>


Re: Checkpointing under backpressure

2019-08-14 Thread Paris Carbone
Sure, I see. In cases where no periodic aligned snapshots are employed, this is 
the only option.

Two things that were not highlighted enough so far about the proposed protocol 
(my mails included):
- The recovery/reconfiguration strategy should strictly prioritise 
processing logged events before entering normal task input operation. Otherwise 
causality can be violated. This also means dataflow recovery is expected to be 
slower than recovery from an aligned snapshot.
- Same as with state capture, markers should be forwarded upon the first 
marker received on input, no later than that. Otherwise we have duplicate side 
effects.

Thanks for the great ideas so far.

Paris

> On 14 Aug 2019, at 14:33, Stephan Ewen  wrote:
> 
> Scaling with unaligned checkpoints might be a necessity.
> 
> Let's assume the job failed due to a lost TaskManager, but no new
> TaskManager becomes available.
> In that case we need to scale down based on the latest complete checkpoint,
> because we cannot produce a new checkpoint.
> 
> 
> On Wed, Aug 14, 2019 at 2:05 PM Paris Carbone 
> wrote:
> 
>> +1 I think we are on the same page Stephan.
>> 
>> Rescaling on unaligned checkpoint sounds challenging and a bit
>> unnecessary. No?
>> Why not sticking to aligned snapshots for live reconfiguration/rescaling?
>> It’s a pretty rare operation and it would simplify things by a lot.
>> Everything can be “staged” upon alignment including replacing channels and
>> tasks.
>> 
>> -Paris
>> 
>>> On 14 Aug 2019, at 13:39, Stephan Ewen  wrote:
>>> 
>>> Hi all!
>>> 
>>> Yes, the first proposal of "unaligend checkpoints" (probably two years
>> back
>>> now) drew a major inspiration from Chandy Lamport, as did actually the
>>> original checkpointing algorithm.
>>> 
>>> "Logging data between first and last barrier" versus "barrier jumping
>> over
>>> buffer and storing those buffers" is pretty close same.
>>> However, there are a few nice benefits of the proposal of unaligned
>>> checkpoints over Chandy-Lamport.
>>> 
>>> *## Benefits of Unaligned Checkpoints*
>>> 
>>> (1) It is very similar to the original algorithm (can be seen an an
>>> optional feature purely in the network stack) and thus can share lot's of
>>> code paths.
>>> 
>>> (2) Less data stored. If we make the "jump over buffers" part timeout
>> based
>>> (for example barrier overtakes buffers if not flushed within 10ms) then
>>> checkpoints are in the common case of flowing pipelines aligned without
>>> in-flight data. Only back pressured cases store some in-flight data,
>> which
>>> means we don't regress in the common case and only fix the back pressure
>>> case.
>>> 
>>> (3) Faster checkpoints. Chandy Lamport still waits for all barriers to
>>> arrive naturally, logging on the way. If data processing is slow, this
>> can
>>> still take quite a while.
>>> 
>>> ==> I think both these points are strong reasons to not change the
>>> mechanism away from "trigger sources" and start with CL-style "trigger
>> all".
>>> 
>>> 
>>> *## Possible ways to combine Chandy Lamport and Unaligned Checkpoints*
>>> 
>>> We can think about something like "take state snapshot on first barrier"
>>> and then store buffers until the other barriers arrive. Inside the
>> network
>>> stack, barriers could still overtake and persist buffers.
>>> The benefit would be less latency increase in the channels which already
>>> have received barriers.
>>> However, as mentioned before, not prioritizing the inputs from which
>>> barriers are still missing can also have an adverse effect.
>>> 
>>> 
>>> *## Concerning upgrades*
>>> 
>>> I think it is a fair restriction to say that upgrades need to happen on
>>> aligned checkpoints. It is a rare enough operation.
>>> 
>>> 
>>> *## Concerning re-scaling (changing parallelism)*
>>> 
>>> We need to support that on unaligned checkpoints as well. There are
>> several
>>> feature proposals about automatic scaling, especially down scaling in
>> case
>>> of missing resources. The last snapshot might be a regular checkpoint, so
>>> all checkpoints need to support rescaling.
>>> 
>>> 
>>> *## Concerning end-to-end checkpoint duration and "trigger sources"
>> versus
>>> "trigger all"*
>>> 
>>> I think for the end-to-end checkpoint duration, an "overtake buffers"
>>> approach yields faster checkpoints, as mentioned above (Chandy Lamport
>>> logging still needs to wait for barrier to flow).
>>> 
>>> I don't see the benefit of a "trigger all tasks via RPC concurrently"
>>> approach. Bear in mind that it is still a globally coordinated approach
>> and
>>> you need to wait for the global checkpoint to complete before committing
>>> any side effects.
>>> I believe that the checkpoint time is more determined by the state
>>> checkpoint writing, and the global coordination and metadata commit, than
>>> by the difference in alignment time between "trigger from source and jump
>>> over buffers" versus "trigger all tasks concurrently".
>>> 
>>> Trying to optimize a few tens of 

Re: Checkpointing under backpressure

2019-08-14 Thread Stephan Ewen
Scaling with unaligned checkpoints might be a necessity.

Let's assume the job failed due to a lost TaskManager, but no new
TaskManager becomes available.
In that case we need to scale down based on the latest complete checkpoint,
because we cannot produce a new checkpoint.


On Wed, Aug 14, 2019 at 2:05 PM Paris Carbone 
wrote:

> +1 I think we are on the same page Stephan.
>
> Rescaling on unaligned checkpoint sounds challenging and a bit
> unnecessary. No?
> Why not sticking to aligned snapshots for live reconfiguration/rescaling?
> It’s a pretty rare operation and it would simplify things by a lot.
> Everything can be “staged” upon alignment including replacing channels and
> tasks.
>
> -Paris
>
> > On 14 Aug 2019, at 13:39, Stephan Ewen  wrote:
> >
> > Hi all!
> >
> > Yes, the first proposal of "unaligend checkpoints" (probably two years
> back
> > now) drew a major inspiration from Chandy Lamport, as did actually the
> > original checkpointing algorithm.
> >
> > "Logging data between first and last barrier" versus "barrier jumping
> over
> > buffer and storing those buffers" is pretty close same.
> > However, there are a few nice benefits of the proposal of unaligned
> > checkpoints over Chandy-Lamport.
> >
> > *## Benefits of Unaligned Checkpoints*
> >
> > (1) It is very similar to the original algorithm (can be seen an an
> > optional feature purely in the network stack) and thus can share lot's of
> > code paths.
> >
> > (2) Less data stored. If we make the "jump over buffers" part timeout
> based
> > (for example barrier overtakes buffers if not flushed within 10ms) then
> > checkpoints are in the common case of flowing pipelines aligned without
> > in-flight data. Only back pressured cases store some in-flight data,
> which
> > means we don't regress in the common case and only fix the back pressure
> > case.
> >
> > (3) Faster checkpoints. Chandy Lamport still waits for all barriers to
> > arrive naturally, logging on the way. If data processing is slow, this
> can
> > still take quite a while.
> >
> > ==> I think both these points are strong reasons to not change the
> > mechanism away from "trigger sources" and start with CL-style "trigger
> all".
> >
> >
> > *## Possible ways to combine Chandy Lamport and Unaligned Checkpoints*
> >
> > We can think about something like "take state snapshot on first barrier"
> > and then store buffers until the other barriers arrive. Inside the
> network
> > stack, barriers could still overtake and persist buffers.
> > The benefit would be less latency increase in the channels which already
> > have received barriers.
> > However, as mentioned before, not prioritizing the inputs from which
> > barriers are still missing can also have an adverse effect.
> >
> >
> > *## Concerning upgrades*
> >
> > I think it is a fair restriction to say that upgrades need to happen on
> > aligned checkpoints. It is a rare enough operation.
> >
> >
> > *## Concerning re-scaling (changing parallelism)*
> >
> > We need to support that on unaligned checkpoints as well. There are
> several
> > feature proposals about automatic scaling, especially down scaling in
> case
> > of missing resources. The last snapshot might be a regular checkpoint, so
> > all checkpoints need to support rescaling.
> >
> >
> > *## Concerning end-to-end checkpoint duration and "trigger sources"
> versus
> > "trigger all"*
> >
> > I think for the end-to-end checkpoint duration, an "overtake buffers"
> > approach yields faster checkpoints, as mentioned above (Chandy Lamport
> > logging still needs to wait for barrier to flow).
> >
> > I don't see the benefit of a "trigger all tasks via RPC concurrently"
> > approach. Bear in mind that it is still a globally coordinated approach
> and
> > you need to wait for the global checkpoint to complete before committing
> > any side effects.
> > I believe that the checkpoint time is more determined by the state
> > checkpoint writing, and the global coordination and metadata commit, than
> > by the difference in alignment time between "trigger from source and jump
> > over buffers" versus "trigger all tasks concurrently".
> >
> > Trying to optimize a few tens of milliseconds out of the network stack
> > sends (and changing the overall checkpointing approach completely for
> that)
> > while staying with a globally coordinated checkpoint will send us down a
> > path to a dead end.
> >
> > To really bring task persistence latency down to 10s of milliseconds (so
> we
> > can frequently commit in sinks), we need to take an approach without any
> > global coordination. Tasks need to establish a persistent recovery point
> > individually and at their own discretion, only then can it be frequent
> > enough. To get there, they would need to decouple themselves from the
> > predecessor and successor tasks (via something like persistent channels).
> > This is a different discussion, though, somewhat orthogonal to this one
> > here.
> >
> > Best,
> > Stephan
> >
> >
> > On Wed, 

Re: [DISCUSS] Repository split

2019-08-14 Thread Chesnay Schepler

Let's recap a bit:

Several people have raised the argument that build times can be kept in 
check via other means (mostly differential builds, be it via custom 
scripts or by switching to Gradle). I will start a separate 
discussion thread on this topic, since it is a useful discussion in any 
case.
I agree with this, and believe it is feasible to update the CI process 
to behave as if the repository was split.



The suggestion of a "project split" within a single repository was 
brought up.
This approach is a mixed bag; it avoids the downsides for the development 
process that multiple repositories would incur, but also has only a few 
upsides. It seems primarily relevant for local development, where one 
might want to skip certain modules when running tests.


There's no benefit from the CI side: since we're still limited to 1 
.travis.yml, whatever rules we want to set up (e.g., "do not test core 
if only connectors are modified") have to be handled by the CI scripts 
regardless of whether the project is split or not.


Overall, I'd like to put this item on ice for the time being; the 
subsequent item is related, vastly more impactful and may also render 
this item obsolete.



A major topic of discussion is that of the development process. It was 
pointed out that having a split repository makes the dev process more 
complicated, since certain changes turn into a 2-step process (merge to 
core, then merge to connectors). Others have pointed out that this may 
actually be an advantage, as it (to some extent) enforces that changes 
to core are also tested in core.


I find myself more in the latter camp; it is all too easy for people to 
make a change to the core while making whatever adjustments to 
connectors to make things fit. A recent change to the ClosureCleaner in 
1.8.0 comes to mind, 
which, with a split repo, may have resulted in build failures in the 
connectors project. (provided that the time-frame between the 2 merges 
is sufficiently large...) As Arvid pointed out, having to feel the pain 
that users have to go through may not be such a bad thing.


This is a fundamental discussion as to whether we want to continue with 
a centralized development of all components.


Robert also pointed out that such a split could result in us 
establishing entirely separate projects. We've had times in the past 
(like the first flink-ml library) where such a setup may have simplified 
things (back then we had lot's of contributors but no committer to 
shepherd the effort; a separate project could be more lenient when it 
comes to appointing new committers).



@Robert We should have a SNAPSHOT dependency /somewhere/ in the 
connector repo, to detect issues (like the ClosureCleaner one) in a 
timely manner and to prepare for new features so that we can have a 
timely release after core, but not necessarily on the master branch.


@Bowen I have implemented and deployed your suggestion to cancel Travis 
builds if the associated PR has been closed.



On 07/08/2019 13:14, Chesnay Schepler wrote:

Hello everyone,

The Flink project sees an ever-increasing amount of dev activity, both 
in terms of reworked and new features.


This is of course an excellent situation to be in, but we are getting 
to a point where the associate downsides are becoming increasingly 
troublesome.


The ever-increasing build times, in addition to unstable tests, 
significantly slow down the development process.
Additionally, pull requests for smaller features frequently slip 
through the cracks as they are buried under a mountain of other 
pull requests.


As a result I'd like to start a discussion on splitting the Flink 
repository.


In this mail I will outline the core idea, and what problems I 
currently envision.


I'd specifically like to encourage those who were part of similar 
initiatives in other projects to share their experiences and ideas.



   General Idea

For starters, the idea is to create a new repository for 
"flink-connectors".
For the remainder of this mail, the current Flink repository is 
referred to as "flink-main".


There are also other candidates that we could discuss in the future, 
like flink-libraries (the next top-priority repo to ease flink-ml 
development), metric reporters, filesystems and flink-formats.


Moving out flink-connectors provides the most benefits, as we straight 
away save at least an hour of testing time, and not being included in 
the binary distribution simplifies a few things.



   Problems to solve

To make this a reality there's a number of questions we have to 
discuss; some in the short-term, others in the long-term.


1) Git history

   We have to decide whether we want to rewrite the history of sub
   repositories to only contain diffs/commits related to this part of
   Flink, or whether we just fork from some commit in flink-main and
   add a commit to the connector repo that "transforms" it from
   

Re: Checkpointing under backpressure

2019-08-14 Thread Paris Carbone
+1 I think we are on the same page Stephan.

Rescaling on an unaligned checkpoint sounds challenging and a bit unnecessary. No?
Why not stick to aligned snapshots for live reconfiguration/rescaling? It's 
a pretty rare operation and it would simplify things by a lot. Everything can 
be "staged" upon alignment, including replacing channels and tasks.

-Paris

> On 14 Aug 2019, at 13:39, Stephan Ewen  wrote:
> 
> Hi all!
> 
> Yes, the first proposal of "unaligend checkpoints" (probably two years back
> now) drew a major inspiration from Chandy Lamport, as did actually the
> original checkpointing algorithm.
> 
> "Logging data between first and last barrier" versus "barrier jumping over
> buffer and storing those buffers" is pretty close same.
> However, there are a few nice benefits of the proposal of unaligned
> checkpoints over Chandy-Lamport.
> 
> *## Benefits of Unaligned Checkpoints*
> 
> (1) It is very similar to the original algorithm (can be seen an an
> optional feature purely in the network stack) and thus can share lot's of
> code paths.
> 
> (2) Less data stored. If we make the "jump over buffers" part timeout based
> (for example barrier overtakes buffers if not flushed within 10ms) then
> checkpoints are in the common case of flowing pipelines aligned without
> in-flight data. Only back pressured cases store some in-flight data, which
> means we don't regress in the common case and only fix the back pressure
> case.
> 
> (3) Faster checkpoints. Chandy Lamport still waits for all barriers to
> arrive naturally, logging on the way. If data processing is slow, this can
> still take quite a while.
> 
> ==> I think both these points are strong reasons to not change the
> mechanism away from "trigger sources" and start with CL-style "trigger all".
> 
> 
> *## Possible ways to combine Chandy Lamport and Unaligned Checkpoints*
> 
> We can think about something like "take state snapshot on first barrier"
> and then store buffers until the other barriers arrive. Inside the network
> stack, barriers could still overtake and persist buffers.
> The benefit would be less latency increase in the channels which already
> have received barriers.
> However, as mentioned before, not prioritizing the inputs from which
> barriers are still missing can also have an adverse effect.
> 
> 
> *## Concerning upgrades*
> 
> I think it is a fair restriction to say that upgrades need to happen on
> aligned checkpoints. It is a rare enough operation.
> 
> 
> *## Concerning re-scaling (changing parallelism)*
> 
> We need to support that on unaligned checkpoints as well. There are several
> feature proposals about automatic scaling, especially down scaling in case
> of missing resources. The last snapshot might be a regular checkpoint, so
> all checkpoints need to support rescaling.
> 
> 
> *## Concerning end-to-end checkpoint duration and "trigger sources" versus
> "trigger all"*
> 
> I think for the end-to-end checkpoint duration, an "overtake buffers"
> approach yields faster checkpoints, as mentioned above (Chandy Lamport
> logging still needs to wait for barrier to flow).
> 
> I don't see the benefit of a "trigger all tasks via RPC concurrently"
> approach. Bear in mind that it is still a globally coordinated approach and
> you need to wait for the global checkpoint to complete before committing
> any side effects.
> I believe that the checkpoint time is more determined by the state
> checkpoint writing, and the global coordination and metadata commit, than
> by the difference in alignment time between "trigger from source and jump
> over buffers" versus "trigger all tasks concurrently".
> 
> Trying to optimize a few tens of milliseconds out of the network stack
> sends (and changing the overall checkpointing approach completely for that)
> while staying with a globally coordinated checkpoint will send us down a
> path to a dead end.
> 
> To really bring task persistence latency down to 10s of milliseconds (so we
> can frequently commit in sinks), we need to take an approach without any
> global coordination. Tasks need to establish a persistent recovery point
> individually and at their own discretion, only then can it be frequent
> enough. To get there, they would need to decouple themselves from the
> predecessor and successor tasks (via something like persistent channels).
> This is a different discussion, though, somewhat orthogonal to this one
> here.
> 
> Best,
> Stephan
> 
> 
> On Wed, Aug 14, 2019 at 12:37 PM Piotr Nowojski  wrote:
> 
>> Hi again,
>> 
>> Zhu Zhu let me think about this more. Maybe as Paris is writing, we do not
>> need to block any channels at all, at least assuming credit base flow
>> control. Regarding what should happen with the following checkpoint is
>> another question. Also, should we support concurrent checkpoints and
>> subsuming checkpoints as we do now? Maybe not…
>> 
>> Paris
>> 
>> Re
>> I. 2. a) and b) - yes, this would have to be taken into an account
>> I. 2. c) and IV. 2. - 

Re: Checkpointing under backpressure

2019-08-14 Thread Paris Carbone
Thanks for the responses. Things are starting to get a bit clearer for everyone now. 

@Zhuzhu overlapping unaligned snapshots should be aborted/avoided imho.
@Piotr point II, it was written a little too quickly, sorry about that.
Simply put, the following two approaches are equivalent for a valid checkpoint. 

1. A -> S -> F
2. S -> F -> L

Other variants introduce issues. For example:

3. L -> S (duplicate side effects)
4. L -> A -> S  (again duplicates)
5. S -> L -> A  (valid but unnecessary channel blocking)

Abbreviations
—

S: Capture state
L: Log pending channels
F: Forward Marker
A: Align markers by blocking non-pending channels

What I like about the Chandy Lamport approach (2.) initiated from sources is 
that:
- Snapshotting imposes no modification to normal processing. 
- Can exploit pipeline parallelism if started from the sources, 
following the natural processing order without concurrent operations to 
backends (think of long pipelines, e.g., 30 operators with high parallelism, 
e.g. 100)

I would love to see a design doc soon and discuss ideas there instead. Might 
not seem like it but I don’t fancy writing long and boring emails. :)

-Paris
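
To make variant 2 (S -> F -> L) concrete, here is a toy single-task model of
the ordering: capture state on the first marker, forward the marker
immediately, and log records arriving on channels whose marker is still
pending. It deliberately ignores RPC initiation, concurrent checkpoints and
the real network stack, and all class/method names are made up:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the "S -> F -> L" variant: on the FIRST barrier of a checkpoint
// the task captures its state (S) and forwards the marker downstream (F);
// afterwards it logs (L) every record arriving on channels that have not yet
// delivered their barrier. No channel is ever blocked.
public class ChandyLamportTaskSketch {

    private final int numChannels;
    private final Set<Integer> pendingChannels = new HashSet<>(); // barriers still missing
    private final List<Object> inFlightLog = new ArrayList<>();   // logged records, part of the snapshot
    private boolean snapshotInProgress = false;

    public ChandyLamportTaskSketch(int numChannels) {
        this.numChannels = numChannels;
    }

    // Called for every data record read from an input channel.
    public void onRecord(int channel, Object record) {
        if (snapshotInProgress && pendingChannels.contains(channel)) {
            inFlightLog.add(record); // L: record belongs to the in-flight state of this checkpoint
        }
        process(record);             // normal processing continues, no channel blocking
    }

    // Called when a checkpoint barrier arrives on an input channel.
    public void onBarrier(int channel, long checkpointId) {
        if (!snapshotInProgress) {
            snapshotInProgress = true;
            for (int i = 0; i < numChannels; i++) {
                if (i != channel) {
                    pendingChannels.add(i);
                }
            }
            captureOperatorState(checkpointId); // S: snapshot on the FIRST barrier
            forwardBarrier(checkpointId);       // F: forward immediately, no alignment
        }
        pendingChannels.remove(channel);
        if (pendingChannels.isEmpty()) {
            finishSnapshot(checkpointId);       // state + inFlightLog form the task snapshot
            snapshotInProgress = false;
            inFlightLog.clear();
        }
    }

    private void process(Object record) { /* user function */ }
    private void captureOperatorState(long checkpointId) { /* async state snapshot */ }
    private void forwardBarrier(long checkpointId) { /* emit barrier to all outputs */ }
    private void finishSnapshot(long checkpointId) { /* persist inFlightLog with the state */ }
}

On recovery, a task would first replay inFlightLog (as noted above) before
reading from its regular inputs.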

> On 14 Aug 2019, at 13:10, Yun Gao  wrote:
> 
>Hi,
>   Very thanks for sharing the thoughts on the unaligned checkpoint !
>
>  Another question regarding I 2.C (Performance) by Paris is that do we 
> always snapshot and broadcast the marks once the task receives the first mark 
> from JM o? If so, then we will always need to snapshot all the records before 
> the next barriers in all the input channels . However, I thinks users may be 
> able to tolerate a fixed interval for the checkpointing, and we may postpone 
> the snapshot and broadcast till a configurable fixed time passed. With such a 
> postpone, jobs without back pressure could still avoid most IO operations and 
> storage overhead, and jobs with back pressure could also be able to finish 
> the checkpoint in a fixed interval with less IO overhead.  
> 
> Best, 
> Yun
> 
> --
> From:Piotr Nowojski 
> Send Time:2019 Aug. 14 (Wed.) 18:38
> To:Paris Carbone 
> Cc:dev ; zhijiang ; Nico 
> Kruber 
> Subject:Re: Checkpointing under backpressure
> 
> Hi again,
> 
> Zhu Zhu let me think about this more. Maybe as Paris is writing, we do not 
> need to block any channels at all, at least assuming credit base flow 
> control. Regarding what should happen with the following checkpoint is 
> another question. Also, should we support concurrent checkpoints and 
> subsuming checkpoints as we do now? Maybe not…
> 
> Paris
> 
> Re 
> I. 2. a) and b) - yes, this would have to be taken into an account
> I. 2. c) and IV. 2. - without those, end to end checkpoint time will probably 
> be longer than it could be. It might affect external systems. For example 
> Kafka, which automatically time outs lingering transactions, and for us, the 
> transaction time is equal to the time between two checkpoints.
> 
> II 1. - I’m confused. To make things straight. Flink is currently 
> snapshotting once it receives all of the checkpoint barriers from all of the 
> input channels and only then it broadcasts the checkpoint barrier down the 
> stream. And this is correct from exactly-once perspective. 
> 
> As far as I understand, your proposal based on Chandy Lamport algorithm, is 
> snapshotting the state of the operator on the first checkpoint barrier, which 
> also looks correct to me.
> 
> III. 1. As I responded to Zhu Zhu, let me think a bit more about this.
> 
> V. Yes, we still need aligned checkpoints, as they are easier for state 
> migration and upgrades. 
> 
> Piotrek
> 
> > On 14 Aug 2019, at 11:22, Paris Carbone  wrote:
> > 
> > Now I see a little more clearly what you have in mind. Thanks for the 
> > explanation!
> > There are a few intermixed concepts here, some how to do with correctness 
> > some with performance.
> > Before delving deeper I will just enumerate a few things to make myself a 
> > little more helpful if I can.
> > 
> > I. Initiation
> > -
> > 
> > 1. RPC to sources only is a less intrusive way to initiate snapshots since 
> > you utilize better pipeline parallelism (only a small subset of tasks is 
> > running progressively the protocol at a time, if snapshotting is async the 
> > overall overhead might not even be observable).
> > 
> > 2. If we really want an RPC to all initiation take notice of the following 
> > implications:
> >  
> >  a. (correctness) RPC calls are not guaranteed to arrive in every task 
> > before a marker from a preceding task.
> > 
> >  b. (correctness) Either the RPC call OR the first arriving marker should 
> > initiate the algorithm. Whichever comes first. If you only do it per RPC 
> > call then you capture a "late" state that includes side effects of already 
> > logged events.
> > 
> >  c. (performance) Lots of IO will be invoked at the same time on 

[jira] [Created] (FLINK-13717) allow to set taskmanager.host and taskmanager.bind-host separately

2019-08-14 Thread Robert Fiser (JIRA)
Robert Fiser created FLINK-13717:


 Summary: allow to set taskmanager.host and taskmanager.bind-host 
separately
 Key: FLINK-13717
 URL: https://issues.apache.org/jira/browse/FLINK-13717
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Configuration, Runtime / Network
Affects Versions: 1.8.1, 1.9.0
Reporter: Robert Fiser
 Fix For: 1.9.1


We are trying to use Flink in a Docker container with a bridge network.

Without specifying taskmanager.host, the taskmanager binds a host/address which is 
not visible in the cluster. It's the same behavior when taskmanager.host is set to 
0.0.0.0.

When it is set to an external address or host name, the taskmanager cannot bind the 
address because of the bridge network.

So we need to set taskmanager.host, which will be reported to the jobmanager, and 
taskmanager.bind-host, which the taskmanager can bind inside the container.

It is similar to https://issues.apache.org/jira/browse/FLINK-2821 but the problem 
is with taskmanagers.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread Stephan Ewen
+1

the "main" method is the overwhelming default. getting rid of "two ways to
do things" is a good idea.

On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas  wrote:

> Hi all,
>
> As discussed in [1] , the Program interface seems to be outdated and
> there seems to be
> no objection to remove it.
>
> Given that this interface is PublicEvolving, its removal should pass
> through a FLIP and
> this discussion and the associated FLIP are exactly for that purpose.
>
> Please let me know what you think and if it is ok to proceed with its
> removal.
>
> Cheers,
> Kostas
>
> link to FLIP-52 :
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
>
> [1]
> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
>


Re: Checkpointing under backpressure

2019-08-14 Thread Stephan Ewen
Hi all!

Yes, the first proposal of "unaligned checkpoints" (probably two years back
now) drew a major inspiration from Chandy Lamport, as did actually the
original checkpointing algorithm.

"Logging data between first and last barrier" versus "barrier jumping over
buffer and storing those buffers" is pretty close same.
However, there are a few nice benefits of the proposal of unaligned
checkpoints over Chandy-Lamport.

*## Benefits of Unaligned Checkpoints*

(1) It is very similar to the original algorithm (can be seen as an
optional feature purely in the network stack) and thus can share lots of
code paths.

(2) Less data stored. If we make the "jump over buffers" part timeout based
(for example barrier overtakes buffers if not flushed within 10ms) then
checkpoints are in the common case of flowing pipelines aligned without
in-flight data. Only back pressured cases store some in-flight data, which
means we don't regress in the common case and only fix the back pressure
case.

(3) Faster checkpoints. Chandy Lamport still waits for all barriers to
arrive naturally, logging on the way. If data processing is slow, this can
still take quite a while.

==> I think both these points are strong reasons to not change the
mechanism away from "trigger sources" and start with CL-style "trigger all".


*## Possible ways to combine Chandy Lamport and Unaligned Checkpoints*

We can think about something like "take state snapshot on first barrier"
and then store buffers until the other barriers arrive. Inside the network
stack, barriers could still overtake and persist buffers.
The benefit would be less latency increase in the channels which already
have received barriers.
However, as mentioned before, not prioritizing the inputs from which
barriers are still missing can also have an adverse effect.


*## Concerning upgrades*

I think it is a fair restriction to say that upgrades need to happen on
aligned checkpoints. It is a rare enough operation.


*## Concerning re-scaling (changing parallelism)*

We need to support that on unaligned checkpoints as well. There are several
feature proposals about automatic scaling, especially down scaling in case
of missing resources. The last snapshot might be a regular checkpoint, so
all checkpoints need to support rescaling.


*## Concerning end-to-end checkpoint duration and "trigger sources" versus
"trigger all"*

I think for the end-to-end checkpoint duration, an "overtake buffers"
approach yields faster checkpoints, as mentioned above (Chandy Lamport
logging still needs to wait for barrier to flow).

I don't see the benefit of a "trigger all tasks via RPC concurrently"
approach. Bear in mind that it is still a globally coordinated approach and
you need to wait for the global checkpoint to complete before committing
any side effects.
I believe that the checkpoint time is more determined by the state
checkpoint writing, and the global coordination and metadata commit, than
by the difference in alignment time between "trigger from source and jump
over buffers" versus "trigger all tasks concurrently".

Trying to optimize a few tens of milliseconds out of the network stack
sends (and changing the overall checkpointing approach completely for that)
while staying with a globally coordinated checkpoint will send us down a
path to a dead end.

To really bring task persistence latency down to 10s of milliseconds (so we
can frequently commit in sinks), we need to take an approach without any
global coordination. Tasks need to establish a persistent recovery point
individually and at their own discretion, only then can it be frequent
enough. To get there, they would need to decouple themselves from the
predecessor and successor tasks (via something like persistent channels).
This is a different discussion, though, somewhat orthogonal to this one
here.

Best,
Stephan


On Wed, Aug 14, 2019 at 12:37 PM Piotr Nowojski  wrote:

> Hi again,
>
> Zhu Zhu let me think about this more. Maybe as Paris is writing, we do not
> need to block any channels at all, at least assuming credit base flow
> control. Regarding what should happen with the following checkpoint is
> another question. Also, should we support concurrent checkpoints and
> subsuming checkpoints as we do now? Maybe not…
>
> Paris
>
> Re
> I. 2. a) and b) - yes, this would have to be taken into an account
> I. 2. c) and IV. 2. - without those, end to end checkpoint time will
> probably be longer than it could be. It might affect external systems. For
> example Kafka, which automatically time outs lingering transactions, and
> for us, the transaction time is equal to the time between two checkpoints.
>
> II 1. - I’m confused. To make things straight. Flink is currently
> snapshotting once it receives all of the checkpoint barriers from all of
> the input channels and only then it broadcasts the checkpoint barrier down
> the stream. And this is correct from exactly-once perspective.
>
> As far as I understand, your proposal based on 

Re: Checkpointing under backpressure

2019-08-14 Thread Yun Gao
   Hi,
  Many thanks for sharing the thoughts on the unaligned checkpoint!
  Another question regarding I 2.c (Performance) by Paris: do we 
always snapshot and broadcast the markers once the task receives the first marker 
from the JM? If so, then we will always need to snapshot all the records before 
the next barriers in all the input channels. However, I think users may be 
able to tolerate a fixed interval for the checkpointing, and we may postpone 
the snapshot and broadcast until a configurable fixed time has passed. With such a 
postponement, jobs without back pressure could still avoid most IO operations and 
storage overhead, and jobs with back pressure would also be able to finish the 
checkpoint in a fixed interval with less IO overhead.

Best, 
Yun


--
From:Piotr Nowojski 
Send Time:2019 Aug. 14 (Wed.) 18:38
To:Paris Carbone 
Cc:dev ; zhijiang ; Nico 
Kruber 
Subject:Re: Checkpointing under backpressure

Hi again,

Zhu Zhu let me think about this more. Maybe as Paris is writing, we do not need 
to block any channels at all, at least assuming credit base flow control. 
Regarding what should happen with the following checkpoint is another question. 
Also, should we support concurrent checkpoints and subsuming checkpoints as we 
do now? Maybe not…

Paris

Re 
I. 2. a) and b) - yes, this would have to be taken into an account
I. 2. c) and IV. 2. - without those, end to end checkpoint time will probably 
be longer than it could be. It might affect external systems. For example 
Kafka, which automatically time outs lingering transactions, and for us, the 
transaction time is equal to the time between two checkpoints.

II 1. - I’m confused. To make things straight. Flink is currently snapshotting 
once it receives all of the checkpoint barriers from all of the input channels 
and only then it broadcasts the checkpoint barrier down the stream. And this is 
correct from exactly-once perspective. 

As far as I understand, your proposal based on Chandy Lamport algorithm, is 
snapshotting the state of the operator on the first checkpoint barrier, which 
also looks correct to me.

III. 1. As I responded to Zhu Zhu, let me think a bit more about this.

V. Yes, we still need aligned checkpoints, as they are easier for state 
migration and upgrades. 

Piotrek

> On 14 Aug 2019, at 11:22, Paris Carbone  wrote:
> 
> Now I see a little more clearly what you have in mind. Thanks for the 
> explanation!
> There are a few intermixed concepts here, some how to do with correctness 
> some with performance.
> Before delving deeper I will just enumerate a few things to make myself a 
> little more helpful if I can.
> 
> I. Initiation
> -
> 
> 1. RPC to sources only is a less intrusive way to initiate snapshots since 
> you utilize better pipeline parallelism (only a small subset of tasks is 
> running progressively the protocol at a time, if snapshotting is async the 
> overall overhead might not even be observable).
> 
> 2. If we really want an RPC to all initiation take notice of the following 
> implications:
>  
>  a. (correctness) RPC calls are not guaranteed to arrive in every task before 
> a marker from a preceding task.
> 
>  b. (correctness) Either the RPC call OR the first arriving marker should 
> initiate the algorithm. Whichever comes first. If you only do it per RPC call 
> then you capture a "late" state that includes side effects of already logged 
> events.
> 
>  c. (performance) Lots of IO will be invoked at the same time on the backend 
> store from all tasks. This might lead to high congestion in async snapshots.
> 
> II. Capturing State First
> -
> 
> 1. (correctness) Capturing state at the last marker sounds incorrect to me 
> (state contains side effects of already logged events based on the proposed 
> scheme). This results into duplicate processing. No?
> 
> III. Channel Blocking / "Alignment"
> ---
> 
> 1. (performance?) What is the added benefit? We dont want a "complete" 
> transactional snapshot, async snapshots are purely for failure-recovery. 
> Thus, I dont see why this needs to be imposed at the expense of 
> performance/throughput. With the proposed scheme the whole dataflow anyway 
> enters snapshotting/logging mode so tasks more or less snapshot concurrently. 
> 
> IV Marker Bypassing
> ---
> 
> 1. (correctness) This leads to equivalent in-flight snapshots so with some 
> quick thinking  correct. I will try to model this later and get back to you 
> in case I find something wrong.
> 
> 2. (performance) It also sounds like a meaningful optimisation! I like 
> thinking of this as a push-based snapshot. i.e., the producing task somehow 
> triggers forward a consumer/channel to capture its state. By example consider 
> T1 -> |marker t1| -> T2. 
> 
> V. Usage of "Async" Snapshots
> -
> 
> 1. Do you see this 

Re: [DISCUSS] Drop stale class Program

2019-08-14 Thread Zili Chen
Thanks for your attention!

Thank Kostas for creating the JIRA and drafting the FLIP.
I would volunteer to help review it :-)

It's good to see that we make progress on this thread.

Best,
tison.


Kostas Kloudas  于2019年8月14日周三 下午6:39写道:

> I already opened a JIRA for the removal and I will also create a (short)
> FLIP, as it is a PublicEvolving interface and its removal should go through
> a FLIP.
>
> The JIRA can be found here
> https://issues.apache.org/jira/browse/FLINK-13713
>
> Cheers,
> Kostas
>
> On Wed, Aug 14, 2019 at 12:13 PM Stephan Ewen  wrote:
>
> > +1 to drop it.
> >
> > It's one of the oldest pieces of legacy.
> >
> > On Wed, Aug 14, 2019 at 12:07 PM Aljoscha Krettek 
> > wrote:
> >
> > > Hi,
> > >
> > > I would be in favour of removing Program (and the code paths that
> support
> > > it) for Flink 1.10. Most users of Flink don’t actually know it exists
> and
> > > it is only making our code more complicated. Going forward with the new
> > > Client API discussions will be a lot easier without it as well.
> > >
> > > Best,
> > > Aljoscha
> > >
> > > > On 14. Aug 2019, at 11:08, Kostas Kloudas 
> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > It is nice to have this discussion.
> > > >
> > > > I am totally up for removing the unused Program interface, as this
> will
> > > > simplify a lot of other code paths in the ClusterClient and
> elsewhere.
> > > >
> > > > Also about the easier integration of Flink with other frameworks,
> there
> > > > is another discussion in the mailing list with exactly this topic:
> > > > [DISCUSS] Flink client api enhancement for downstream project
> > > >
> > > > Cheers,
> > > > Kostas
> > > >
> > > >
> > > > On Tue, Jul 30, 2019 at 1:38 PM Zili Chen 
> > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> With a one-week survey in user list[1], nobody expect Flavio and
> Jeff
> > > >> participant the thread.
> > > >>
> > > >> Flavio shared his experience with a revised Program like interface.
> > > >> This could be regraded as downstream integration and in client api
> > > >> enhancements document we propose rich interface for this
> integration.
> > > >> Anyway, the Flink scope Program is less functional than it should
> be.
> > > >>
> > > >> With no objection I'd like to push on this thread. We need a
> committer
> > > >> participant this thread to shepherd the removal/deprecation of
> > Program,
> > > a
> > > >> @PublicEvolving interface. Anybody has spare time? Or anything I can
> > do
> > > >> to make progress?
> > > >>
> > > >> Best,
> > > >> tison.
> > > >>
> > > >> [1]
> > > >>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
> > > >>
> > > >>
> > > >> Zili Chen  于2019年7月22日周一 下午8:38写道:
> > > >>
> > > >>> Hi,
> > > >>>
> > > >>> I created a thread for survey in user list[1]. Please take
> > participate
> > > in
> > > >>> if interested.
> > > >>>
> > > >>> Best,
> > > >>> tison.
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
> > > >>>
> > > >>>
> > > >>> Flavio Pompermaier  于2019年7月19日周五 下午5:18写道:
> > > >>>
> > >  +1 to remove directly the Program class (I think nobody use it and
> > > it's
> > >  not
> > >  supported at all by REST services and UI).
> > >  Moreover it requires a lot of transitive dependencies while it
> > should
> > > >> be a
> > >  very simple thing..
> > >  +1 to add this discussion to "Flink client api enhancement"
> > > 
> > >  On Fri, Jul 19, 2019 at 11:14 AM Biao Liu 
> > wrote:
> > > 
> > > > To Flavio, good point for the integration suggestion.
> > > >
> > > > I think it should be considered in the "Flink client api
> > enhancement"
> > > > discussion. But the outdated API should be deprecated somehow.
> > > >
> > > > Flavio Pompermaier  于2019年7月19日周五
> 下午4:21写道:
> > > >
> > > >> In my experience a basic "official" (but optional) program
> > > >> description
> > > >> would be very useful indeed (in order to ease the integration
> with
> > >  other
> > > >> frameworks).
> > > >>
> > > >> Of course it should be extended and integrated with the REST
> > > >> services
> > >  and
> > > >> the Web UI (when defined) in order to be useful..
> > > >> It ease to show to the user what a job does and which parameters
> > it
> > > >> requires (optional or mandatory) and with a proper help
> > description.
> > > >> Indeed, when we write a Flink job we implement the following
> > >  interface:
> > > >>
> > > >> public interface FlinkJob {
> > > >>  String getDescription();
> > > >>  List<ClusterJobParameter> getParameters();
> > > >>  boolean isStreamingOrBatch();
> > > >> }
> > > >>
> > > >> public class ClusterJobParameter {
> > > >>
> > > >>  private String paramName;
> > > >>  private String 

[jira] [Created] (FLINK-13716) Remove Program-related Chinese documentation

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13716:
--

 Summary: Remove Program-related Chinese documentation
 Key: FLINK-13716
 URL: https://issues.apache.org/jira/browse/FLINK-13716
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13715) Remove Program-related English documentation.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13715:
--

 Summary: Remove Program-related English documentation.
 Key: FLINK-13715
 URL: https://issues.apache.org/jira/browse/FLINK-13715
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13714) Remove Program-related code.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13714:
--

 Summary: Remove Program-related code.
 Key: FLINK-13714
 URL: https://issues.apache.org/jira/browse/FLINK-13714
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] Drop stale class Program

2019-08-14 Thread Kostas Kloudas
I already opened a JIRA for the removal and I will also create a (short)
FLIP, as it is a PublicEvolving interface and its removal should go through
a FLIP.

The JIRA can be found here https://issues.apache.org/jira/browse/FLINK-13713

Cheers,
Kostas

On Wed, Aug 14, 2019 at 12:13 PM Stephan Ewen  wrote:

> +1 to drop it.
>
> It's one of the oldest pieces of legacy.
>
> On Wed, Aug 14, 2019 at 12:07 PM Aljoscha Krettek 
> wrote:
>
> > Hi,
> >
> > I would be in favour of removing Program (and the code paths that support
> > it) for Flink 1.10. Most users of Flink don’t actually know it exists and
> > it is only making our code more complicated. Going forward with the new
> > Client API discussions will be a lot easier without it as well.
> >
> > Best,
> > Aljoscha
> >
> > > On 14. Aug 2019, at 11:08, Kostas Kloudas  wrote:
> > >
> > > Hi all,
> > >
> > > It is nice to have this discussion.
> > >
> > > I am totally up for removing the unused Program interface, as this will
> > > simplify a lot of other code paths in the ClusterClient and elsewhere.
> > >
> > > Also about the easier integration of Flink with other frameworks, there
> > > is another discussion in the mailing list with exactly this topic:
> > > [DISCUSS] Flink client api enhancement for downstream project
> > >
> > > Cheers,
> > > Kostas
> > >
> > >
> > > On Tue, Jul 30, 2019 at 1:38 PM Zili Chen 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> With a one-week survey in user list[1], nobody expect Flavio and Jeff
> > >> participant the thread.
> > >>
> > >> Flavio shared his experience with a revised Program like interface.
> > >> This could be regraded as downstream integration and in client api
> > >> enhancements document we propose rich interface for this integration.
> > >> Anyway, the Flink scope Program is less functional than it should be.
> > >>
> > >> With no objection I'd like to push on this thread. We need a committer
> > >> participant this thread to shepherd the removal/deprecation of
> Program,
> > a
> > >> @PublicEvolving interface. Anybody has spare time? Or anything I can
> do
> > >> to make progress?
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >> [1]
> > >>
> > >>
> >
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
> > >>
> > >>
> > >> Zili Chen  于2019年7月22日周一 下午8:38写道:
> > >>
> > >>> Hi,
> > >>>
> > >>> I created a thread for survey in user list[1]. Please take
> participate
> > in
> > >>> if interested.
> > >>>
> > >>> Best,
> > >>> tison.
> > >>>
> > >>> [1]
> > >>>
> > >>
> >
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
> > >>>
> > >>>
> > >>> Flavio Pompermaier  于2019年7月19日周五 下午5:18写道:
> > >>>
> >  +1 to remove directly the Program class (I think nobody use it and
> > it's
> >  not
> >  supported at all by REST services and UI).
> >  Moreover it requires a lot of transitive dependencies while it
> should
> > >> be a
> >  very simple thing..
> >  +1 to add this discussion to "Flink client api enhancement"
> > 
> >  On Fri, Jul 19, 2019 at 11:14 AM Biao Liu 
> wrote:
> > 
> > > To Flavio, good point for the integration suggestion.
> > >
> > > I think it should be considered in the "Flink client api
> enhancement"
> > > discussion. But the outdated API should be deprecated somehow.
> > >
> > > Flavio Pompermaier  于2019年7月19日周五 下午4:21写道:
> > >
> > >> In my experience a basic "official" (but optional) program
> > >> description
> > >> would be very useful indeed (in order to ease the integration with
> >  other
> > >> frameworks).
> > >>
> > >> Of course it should be extended and integrated with the REST
> > >> services
> >  and
> > >> the Web UI (when defined) in order to be useful..
> > >> It ease to show to the user what a job does and which parameters
> it
> > >> requires (optional or mandatory) and with a proper help
> description.
> > >> Indeed, when we write a Flink job we implement the following
> >  interface:
> > >>
> > >> public interface FlinkJob {
> > >>  String getDescription();
> > >>  List<ClusterJobParameter> getParameters();
> > >>  boolean isStreamingOrBatch();
> > >> }
> > >>
> > >> public class ClusterJobParameter {
> > >>
> > >>  private String paramName;
> > >>  private String paramType = "string";
> > >>  private String paramDesc;
> > >>  private String paramDefaultValue;
> > >>  private Set<String> choices;
> > >>  private boolean mandatory;
> > >> }
> > >>
> > >> This really helps to launch a Flink job by a frontend (if the rest
> > > services
> > >> returns back those infos).
> > >>
> > >> Best,
> > >> Flavio
> > >>
> > >> On Fri, Jul 19, 2019 at 9:57 AM Biao Liu 
> > >> wrote:
> > >>
> > >>> Hi Zili,
> > >>>
> > >>> Thank you for bring us this discussion.
> 

[jira] [Created] (FLINK-13713) Remove legacy Program interface.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13713:
--

 Summary: Remove legacy Program interface.
 Key: FLINK-13713
 URL: https://issues.apache.org/jira/browse/FLINK-13713
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.10.0


The Program interface is an interface used in the Stratosphere days in order to 
write Flink jobs. It is now outdated, as users use the Environments to write 
jobs.

This Jira proposes the removal of this interface and the related code, as it is 
used nowhere in Flink itself but it complicates parts of the codebase.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Checkpointing under backpressure

2019-08-14 Thread Piotr Nowojski
Hi again,

Zhu Zhu, let me think about this more. Maybe, as Paris writes, we do not need 
to block any channels at all, at least assuming credit-based flow control. 
What should happen with the following checkpoint is another question. 
Also, should we support concurrent checkpoints and subsuming checkpoints as we 
do now? Maybe not…

Paris

Re 
I. 2. a) and b) - yes, this would have to be taken into account
I. 2. c) and IV. 2. - without those, the end-to-end checkpoint time will probably 
be longer than it could be. It might affect external systems, for example 
Kafka, which automatically times out lingering transactions; for us, the 
transaction time is equal to the time between two checkpoints.

II 1. - I’m confused. To set things straight: Flink currently snapshots 
once it has received the checkpoint barriers from all of the input channels, 
and only then does it broadcast the checkpoint barrier down the stream. This is 
correct from an exactly-once perspective. 

As far as I understand, your proposal, based on the Chandy-Lamport algorithm, 
snapshots the state of the operator on the first checkpoint barrier, which 
also looks correct to me.
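
(For concreteness, here is a minimal sketch of that "snapshot on the first 
barrier, then log the late channels" behaviour for a single task; the class and 
method names are illustrative only, not existing Flink code.)

import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch: one task with n input channels, snapshotting on the
// first barrier and logging in-flight records from the channels whose barrier
// has not arrived yet.
class FirstBarrierSnapshotter {

    private final int numChannels;
    private final Set<Integer> channelsWithBarrier = new HashSet<>();
    private final Queue<Object> loggedInFlightRecords = new ArrayDeque<>();
    private boolean stateSnapshotTaken;

    FirstBarrierSnapshotter(int numChannels) {
        this.numChannels = numChannels;
    }

    void onBarrier(int channel, Runnable snapshotOperatorState, Runnable broadcastBarrier) {
        if (!stateSnapshotTaken) {
            // First barrier: snapshot the operator state right away and forward
            // the barrier downstream without waiting for the other channels.
            snapshotOperatorState.run();
            broadcastBarrier.run();
            stateSnapshotTaken = true;
        }
        channelsWithBarrier.add(channel);
    }

    void onRecord(int channel, Object record) {
        if (stateSnapshotTaken && !channelsWithBarrier.contains(channel)) {
            // Pre-barrier record on a "late" channel: it becomes part of the
            // checkpoint (channel state). Whether it is also processed right away
            // or the channel is blocked is one of the open questions in this thread.
            loggedInFlightRecords.add(record);
        }
    }

    Queue<Object> channelState() {
        return loggedInFlightRecords;
    }

    boolean isComplete() {
        return stateSnapshotTaken && channelsWithBarrier.size() == numChannels;
    }
}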

III. 1. As I responded to Zhu Zhu, let me think a bit more about this.

V. Yes, we still need aligned checkpoints, as they are easier for state 
migration and upgrades. 

Piotrek

> On 14 Aug 2019, at 11:22, Paris Carbone  wrote:
> 
> Now I see a little more clearly what you have in mind. Thanks for the 
> explanation!
> There are a few intermixed concepts here, some how to do with correctness 
> some with performance.
> Before delving deeper I will just enumerate a few things to make myself a 
> little more helpful if I can.
> 
> I. Initiation
> -
> 
> 1. RPC to sources only is a less intrusive way to initiate snapshots since 
> you utilize better pipeline parallelism (only a small subset of tasks is 
> running progressively the protocol at a time, if snapshotting is async the 
> overall overhead might not even be observable).
> 
> 2. If we really want an RPC to all initiation take notice of the following 
> implications:
>   
>   a. (correctness) RPC calls are not guaranteed to arrive in every task 
> before a marker from a preceding task.
> 
>   b. (correctness) Either the RPC call OR the first arriving marker 
> should initiate the algorithm. Whichever comes first. If you only do it per 
> RPC call then you capture a "late" state that includes side effects of 
> already logged events.
> 
>   c. (performance) Lots of IO will be invoked at the same time on the 
> backend store from all tasks. This might lead to high congestion in async 
> snapshots.
> 
> II. Capturing State First
> -
> 
> 1. (correctness) Capturing state at the last marker sounds incorrect to me 
> (state contains side effects of already logged events based on the proposed 
> scheme). This results into duplicate processing. No?
> 
> III. Channel Blocking / "Alignment"
> ---
> 
> 1. (performance?) What is the added benefit? We dont want a "complete" 
> transactional snapshot, async snapshots are purely for failure-recovery. 
> Thus, I dont see why this needs to be imposed at the expense of 
> performance/throughput. With the proposed scheme the whole dataflow anyway 
> enters snapshotting/logging mode so tasks more or less snapshot concurrently. 
> 
> IV Marker Bypassing
> ---
> 
> 1. (correctness) This leads to equivalent in-flight snapshots so with some 
> quick thinking  correct. I will try to model this later and get back to you 
> in case I find something wrong.
> 
> 2. (performance) It also sounds like a meaningful optimisation! I like 
> thinking of this as a push-based snapshot. i.e., the producing task somehow 
> triggers forward a consumer/channel to capture its state. By example consider 
> T1 -> |marker t1| -> T2. 
> 
> V. Usage of "Async" Snapshots
> -
> 
> 1. Do you see this as a full replacement of "full" aligned 
> snapshots/savepoints? In my view async shanpshots will be needed from time to 
> time but not as frequently. Yet, it seems like a valid approach solely for 
> failure-recovery on the same configuration. Here's why:
> 
>   a. With original snapshotting there is a strong duality between 
>   a stream input (offsets) and committed side effects (internal states 
> and external commits to transactional sinks). While in the async version, 
> there are uncommitted operations (inflight records). Thus, you cannot use 
> these snapshots for e.g., submitting sql queries with snapshot isolation. 
> Also, the original snapshotting gives a lot of potential for flink to make 
> proper transactional commits externally.
> 
>   b. Reconfiguration is very tricky, you probably know that better. 
> Inflight channel state is no longer valid in a new configuration (i.e., new 
> dataflow graph, new operators, updated operator logic, 

Re: [DISCUSS] Best practice to run flink on kubernetes

2019-08-14 Thread Yang Wang
Hi Till,


Thanks for your reply. I agree with you that both option 1 and 3 need to be
supported.


Option 1 is the reactive mode of resource management, where Flink is not aware of
the underlying cluster. If a user has limited resources to run Flink jobs, this
option will be very useful. On the other hand, option 3 is the active mode of
resource management. Compared with option 1, its biggest advantage is that
we could allocate resources from the k8s cluster on demand. Batch jobs in
particular will benefit a lot from this.


I do not mean to abandon the proposal and implementation in FLINK-9953.
Actually, I have contacted the assignee (Chunhui Shi) to help review
and test the PRs. After all the basic implementations have been merged, the
production features will be considered.



Best,

Yang

Till Rohrmann  于2019年8月13日周二 下午5:36写道:

> Hi Yang,
>
> thanks for reviving the discussion about Flink's Kubernetes integration. In
> a nutshell, I think that Flink should support option 1) and 3). Concretely,
> option 1) would be covered by the reactive mode [1] which is not
> necessarily bound to Kubernetes and works in all environments equally well.
> Option 3) is the native Kubernetes integration which is described in the
> design document. Actually, the discussion had been concluded already some
> time ago and there are already multiple PRs open for adding this feature
> [2]. So maybe you could check these PRs out and help the community
> reviewing and merging this code. Based on this we could then think about
> additions/improvements which are necessary.
>
> For option 2), I think a Kubernetes operator would be a good project for
> Flink's ecosystem website [3] and does not need to be necessarily part of
> Flink's repository.
>
> [1] https://issues.apache.org/jira/browse/FLINK-10407
> [2] https://issues.apache.org/jira/browse/FLINK-9953
> [3]
>
> https://lists.apache.org/thread.html/9b873f9dc1dd56d79e0f71418b123def896ed02f57e84461bc0bacb0@%3Cdev.flink.apache.org%3E
>
> Cheers,
> Till
>
> On Mon, Aug 12, 2019 at 5:46 AM Yang Wang  wrote:
>
> > Hi kaibo,
> >
> >
> > I am really appreciated that you could share your use case.
> >
> > As you say, our users in production also could be divided into two
> groups.
> > The common users have more knowledge about flink, they could use the
> > command line to submit job and debug job from logs of job manager and
> > taskmanager in Kubernetes. And for platform users, they use the yaml
> > config files or platform web to submit flink jobs.
> >
> > Regarding your comments:
> >
> > 1. Of course, the option 1(standalone on k8s) should always work as
> > expected. Users could submit the jm/tm/svc resource files to start a
> flink
> > cluster. The option 3(k8s native integration) will support both resource
> > files and command line submission. The resource file below is to create a
> > flink perjob cluster.
> >
> > apiVersion: extensions/v1beta1
> >
> > kind: Deployment
> >
> > metadata:
> >
> >   name: flink-word-count
> >
> > spec:
> >
> >   image: flink-wordcount:latest
> >
> >   flinkConfig:
> >
> > state.checkpoints.dir:
> > file:///checkpoints/flink/externalized-checkpoints
> >
> >   jobManagerConfig:
> >
> > resources:
> >
> >   requests:
> >
> > memory: “1024Mi"
> >
> > cpu: “1”
> >
> >   taskManagerConfig:
> >
> > taskSlots: 2
> >
> > resources:
> >
> >   requests:
> >
> > memory: “1024Mi"
> >
> > cpu: “1”
> >
> >   jobId: “”
> >
> >   parallelism: 3
> >
> >   jobClassName: "org.apache.flink.streaming.examples.wordcount.WordCount"
> >
> > 2. The ability to pass job-classname will be retained. The class should
> be
> > found in the classpath of taskmanager image. The flink per-job cluster
> > describe by yaml resource in section 1 could also be submitted by flink
> > command.
> >
> > flink run -m kubernetes-cluster -p 3 -knm flink-word-count -ki
> > flink-wordcount:latest -kjm 1024 -ktm 1024 -kD
> kubernetes.jobmanager.cpu=1
> > -kD kubernetes.taskmanager.cpu=1 -kjid 
> > -kjc org.apache.flink.streaming.examples.wordcount.WordCount -kD
> > state.checkpoints.dir= file:///checkpoints/flink/externalized-checkpoints
> >
> > 3. The job-id could also be specified by -kjid just like the command
> above.
> >
> > In a nutshell, the option 3 should have all the abilities in option 1.
> > Common users and platform users are all satisfied.
> >
> >
> >
> > Best,
> >
> > Yang
> >
> >
> > Kaibo Zhou  于2019年8月11日周日 下午1:23写道:
> >
> > > Thanks for bringing this up. Obviously, option 2 and 3 are both useful
> > for
> > > fink users on kubernetes. But option 3 is easy for users that not have
> > many
> > > concepts of kubernetes, they can start flink on kubernetes quickly, I
> > think
> > > it should have a higher priority.
> > >
> > > I have worked some time to integrate flink with our platform based on
> > > kubernetes, and have some concerns on option 3 from the platform user's
> > > 

[jira] [Created] (FLINK-13711) Hive array values not properly displayed in SQL CLI

2019-08-14 Thread Rui Li (JIRA)
Rui Li created FLINK-13711:
--

 Summary: Hive array values not properly displayed in SQL CLI
 Key: FLINK-13711
 URL: https://issues.apache.org/jira/browse/FLINK-13711
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Rui Li


Array values are displayed like:
{noformat}
 [Ljava.lang.Integer;@632~
 [Ljava.lang.Integer;@6de~
{noformat}
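
For reference, the rendering comes from the default {{Object#toString}} of Java 
arrays; a minimal sketch of actual vs. expected output (the fix would need 
something like {{Arrays.toString}} when formatting array-typed columns):
{code:java}
import java.util.Arrays;

public class ArrayDisplayDemo {
    public static void main(String[] args) {
        Integer[] value = {1, 2, 3};
        System.out.println(value);                  // [Ljava.lang.Integer;@... (what the CLI currently shows)
        System.out.println(Arrays.toString(value)); // [1, 2, 3]                (expected rendering)
    }
}
{code}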



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] Drop stale class Program

2019-08-14 Thread Aljoscha Krettek
Hi,

I would be in favour of removing Program (and the code paths that support it) 
for Flink 1.10. Most users of Flink don’t actually know it exists and it is 
only making our code more complicated. Going forward with the new Client API 
discussions will be a lot easier without it as well.

Best,
Aljoscha

> On 14. Aug 2019, at 11:08, Kostas Kloudas  wrote:
> 
> Hi all,
> 
> It is nice to have this discussion.
> 
> I am totally up for removing the unused Program interface, as this will
> simplify a lot of other code paths in the ClusterClient and elsewhere.
> 
> Also about the easier integration of Flink with other frameworks, there
> is another discussion in the mailing list with exactly this topic:
> [DISCUSS] Flink client api enhancement for downstream project
> 
> Cheers,
> Kostas
> 
> 
> On Tue, Jul 30, 2019 at 1:38 PM Zili Chen  wrote:
> 
>> Hi,
>> 
>> With a one-week survey in user list[1], nobody expect Flavio and Jeff
>> participant the thread.
>> 
>> Flavio shared his experience with a revised Program like interface.
>> This could be regraded as downstream integration and in client api
>> enhancements document we propose rich interface for this integration.
>> Anyway, the Flink scope Program is less functional than it should be.
>> 
>> With no objection I'd like to push on this thread. We need a committer
>> participant this thread to shepherd the removal/deprecation of Program, a
>> @PublicEvolving interface. Anybody has spare time? Or anything I can do
>> to make progress?
>> 
>> Best,
>> tison.
>> 
>> [1]
>> 
>> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
>> 
>> 
>> Zili Chen  于2019年7月22日周一 下午8:38写道:
>> 
>>> Hi,
>>> 
>>> I created a thread for survey in user list[1]. Please take participate in
>>> if interested.
>>> 
>>> Best,
>>> tison.
>>> 
>>> [1]
>>> 
>> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
>>> 
>>> 
>>> Flavio Pompermaier  于2019年7月19日周五 下午5:18写道:
>>> 
 +1 to remove directly the Program class (I think nobody use it and it's
 not
 supported at all by REST services and UI).
 Moreover it requires a lot of transitive dependencies while it should
>> be a
 very simple thing..
 +1 to add this discussion to "Flink client api enhancement"
 
 On Fri, Jul 19, 2019 at 11:14 AM Biao Liu  wrote:
 
> To Flavio, good point for the integration suggestion.
> 
> I think it should be considered in the "Flink client api enhancement"
> discussion. But the outdated API should be deprecated somehow.
> 
> Flavio Pompermaier  于2019年7月19日周五 下午4:21写道:
> 
>> In my experience a basic "official" (but optional) program
>> description
>> would be very useful indeed (in order to ease the integration with
 other
>> frameworks).
>> 
>> Of course it should be extended and integrated with the REST
>> services
 and
>> the Web UI (when defined) in order to be useful..
>> It ease to show to the user what a job does and which parameters it
>> requires (optional or mandatory) and with a proper help description.
>> Indeed, when we write a Flink job we implement the following
 interface:
>> 
>> public interface FlinkJob {
>>  String getDescription();
>>  List<ClusterJobParameter> getParameters();
>>  boolean isStreamingOrBatch();
>> }
>> 
>> public class ClusterJobParameter {
>> 
>>  private String paramName;
>>  private String paramType = "string";
>>  private String paramDesc;
>>  private String paramDefaultValue;
>>  private Set<String> choices;
>>  private boolean mandatory;
>> }
>> 
>> This really helps to launch a Flink job by a frontend (if the rest
> services
>> returns back those infos).
>> 
>> Best,
>> Flavio
>> 
>> On Fri, Jul 19, 2019 at 9:57 AM Biao Liu 
>> wrote:
>> 
>>> Hi Zili,
>>> 
>>> Thank you for bring us this discussion.
>>> 
>>> My gut feeling is +1 for dropping it.
>>> Usually it costs some time to deprecate a public (actually it's
>>> `PublicEvolving`) API. Ideally it should be marked as `Deprecated`
> first.
>>> Then it might be abandoned it in some later version.
>>> 
>>> I'm not sure how big the burden is to make it compatible with the
>> enhanced
>>> client API. If it's a critical blocker, I support dropping it
 radically
>> in
>>> 1.10. Of course a survey is necessary. And the result of survey is
>>> acceptable.
>>> 
>>> 
>>> 
>>> Zili Chen  于2019年7月19日周五 下午1:44写道:
>>> 
 Hi devs,
 
 Participating the thread "Flink client api enhancement"[1], I
>> just
>> notice
 that inside submission codepath of Flink we always has a branch
> dealing
 with the case that main class of user program is a subclass of
 

Re: Unbearably slow Table API time-windowed stream join with RocksDBStateBackend

2019-08-14 Thread Jark Wu
Hi Xiao,

Thanks for reporting this.
Your approach sounds good to me. But we have many similar problems in
existing streaming SQL operator implementations.
So I think it would be great if the State API / state backend could provide a
better state structure to handle this situation.

This is a problem similar to the poor performance of RocksDBListState, and
the related discussions have been raised several times [1][2].
The root cause is that RocksDBStateBackend serializes the whole list as a single
byte[]. Some ideas were proposed in those threads.
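
For illustration, here is a minimal Java sketch of the composite-key workaround 
described in the quoted mail below (one row per (rowtime, index) key, so RocksDB 
only (de)serializes the single row that is touched); the class and field names 
are illustrative, not actual Flink internals:

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.types.Row;

class PerRowLeftCache {
    // (rowtime, index) -> row: one small value per RocksDB access
    private final MapState<Tuple2<Long, Integer>, Row> rows;
    // rowtime -> number of rows cached for that rowtime
    private final MapState<Long, Integer> sizes;

    PerRowLeftCache(MapState<Tuple2<Long, Integer>, Row> rows,
                    MapState<Long, Integer> sizes) {
        this.rows = rows;
        this.sizes = sizes;
    }

    void add(long rowtime, Row row) throws Exception {
        Integer size = sizes.get(rowtime);
        int index = (size == null) ? 0 : size;
        rows.put(Tuple2.of(rowtime, index), row); // serializes only this row, not a whole list
        sizes.put(rowtime, index + 1);
    }
}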

I cc'ed Yu Li, who works on the state backends.

Thanks,
Jark


[1]: https://issues.apache.org/jira/browse/FLINK-8297
[2]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-FLINK-8297-A-solution-for-FLINK-8297-Timebased-RocksDBListState-tc28259.html


On Wed, 14 Aug 2019 at 14:46, LIU Xiao  wrote:

> Example SQL:
>
> SELECT *
> FROM stream1 s1, stream2 s2
> WHERE s1.id = s2.id AND s1.rowtime = s2.rowtime
>
> And we have lots of messages in stream1 and stream2 share a same rowtime.
>
> It runs fine when using heap as the state backend,
> but requires lots of heap memory sometimes (when upstream out of sync,
> etc), and a risk of full gc exists.
>
> When we use RocksDBStateBackend to lower the heap memory usage, we found
> our program runs unbearably slow.
>
> After examining the code we found that
> org.apache.flink.table.runtime.join.TimeBoundedStreamJoin#processElement1()
> may be the cause of the problem (we are using Flink 1.6, but 1.8 should be the
> same):
> ...
> // Check if we need to cache the current row.
> if (rightOperatorTime < rightQualifiedUpperBound) {
>   // Operator time of right stream has not exceeded the upper window
> bound of the current
>   // row. Put it into the left cache, since later coming records from
> the right stream are
>   // expected to be joined with it.
>   var leftRowList = leftCache.get(timeForLeftRow)
>   if (null == leftRowList) {
> leftRowList = new util.ArrayList[JTuple2[Row, Boolean]](1)
>   }
>   leftRowList.add(JTuple2.of(leftRow, emitted))
>   leftCache.put(timeForLeftRow, leftRowList)
> ...
>
> In above code, if there are lots of messages with a same timeForLeftRow,
> the serialization and deserialization cost will be very high when using
> RocksDBStateBackend.
>
> A simple fix I came up with:
> ...
>   // cache to store rows from the left stream
>   //private var leftCache: MapState[Long, JList[JTuple2[Row, Boolean]]] = _
>   private var leftCache: MapState[JTuple2[Long, Integer],
> JList[JTuple2[Row, Boolean]]] = _
>   private var leftCacheSize: MapState[Long, Integer] = _
> ...
> // Check if we need to cache the current row.
> if (rightOperatorTime < rightQualifiedUpperBound) {
>   // Operator time of right stream has not exceeded the upper window
> bound of the current
>   // row. Put it into the left cache, since later coming records from
> the right stream are
>   // expected to be joined with it.
>   //var leftRowList = leftCache.get(timeForLeftRow)
>   //if (null == leftRowList) {
>   //  leftRowList = new util.ArrayList[JTuple2[Row, Boolean]](1)
>   //}
>   //leftRowList.add(JTuple2.of(leftRow, emitted))
>   //leftCache.put(timeForLeftRow, leftRowList)
>   var leftRowListSize = leftCacheSize.get(timeForLeftRow)
>   if (null == leftRowListSize) {
> leftRowListSize = new Integer(0)
>   }
>   leftCache.put(JTuple2.of(timeForLeftRow, leftRowListSize),
> JTuple2.of(leftRow, emitted))
>   leftCacheSize.put(timeForLeftRow, leftRowListSize + 1)
> ...
>
> --
> LIU Xiao 
>
>


Why is it not possible to handle a custom resource manager?

2019-08-14 Thread Cristian Lorenzetto
I want to develop a project using the Flink stack on top of a custom
distributed system, so I'd like to use my distributed system as the resource
manager instead of overloading the project with many other additional sockets
and code.

Is there a way to embed Flink in my server without using another external
resource manager? For example, a way of setting a custom ResourceManager
interface? I don't know of one.


[jira] [Created] (FLINK-13710) JarListHandler always extract the jar package

2019-08-14 Thread ChengWei Ye (JIRA)
ChengWei Ye created FLINK-13710:
---

 Summary: JarListHandler always extract the jar package
 Key: FLINK-13710
 URL: https://issues.apache.org/jira/browse/FLINK-13710
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.8.1
Reporter: ChengWei Ye


 
{code:java}
// JarListHandler class
// handleRequest method

for (String clazz : classes) {
   clazz = clazz.trim();

   PackagedProgram program = null;
   try {
  // here: constructing the PackagedProgram extracts the jar's nested libraries every time
  program = new PackagedProgram(f, clazz, new String[0]);
   } catch (Exception ignored) {
  // ignore jar files which throw an error upon creating a PackagedProgram
   }
   if (program != null) {
  JarListInfo.JarEntryInfo jarEntryInfo = new 
JarListInfo.JarEntryInfo(clazz, program.getDescription());
  jarEntryList.add(jarEntryInfo);
   }
}
{code}
When I open the submit page of the JM web UI 
([http://localhost:7081/#/submit|http://localhost:8081/#/submit]), the 
backend always decompresses the lib directory of the job jar packages until 
the temp directory is full.

If the JobManager only needs to list the jar information, the submit page should not 
extract the jar packages.

And I think the same jar only needs to be decompressed once, and should not be 
decompressed every time it is submitted.
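
A rough sketch of the "decompress only once" idea (purely hypothetical, not 
existing Flink code): cache the listing result per jar file and modification 
time, so repeated visits to the submit page reuse it.
{code:java}
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class JarEntryCache<T> {

    private static final class Key {
        final Path jar;
        final long lastModified;

        Key(Path jar, long lastModified) {
            this.jar = jar;
            this.lastModified = lastModified;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key
                && ((Key) o).jar.equals(jar)
                && ((Key) o).lastModified == lastModified;
        }

        @Override
        public int hashCode() {
            return jar.hashCode() * 31 + Long.hashCode(lastModified);
        }
    }

    private final Map<Key, List<T>> cache = new ConcurrentHashMap<>();

    // Recomputes (and thus extracts) only when the jar file changed on disk.
    List<T> get(Path jar, long lastModified, Supplier<List<T>> compute) {
        return cache.computeIfAbsent(new Key(jar, lastModified), k -> compute.get());
    }
}
{code}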

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] FLIP-39: Flink ML pipeline and ML libs

2019-08-14 Thread Robert Metzger
It seems that this FLIP doesn't have a Wiki page yet [1], even though it is
already partially implemented [2].
We should try to stick more closely to the FLIP process to manage the project
more efficiently.


[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
[2] https://issues.apache.org/jira/browse/FLINK-12470

On Mon, Jun 17, 2019 at 12:27 PM Gen Luo  wrote:

> Hi all,
>
> In the review of PR for FLINK-12473, there were a few comments regarding
> pipeline exportation. We would like to start a follow up discussions to
> address some related comments.
>
> Currently, FLIP-39 proposal gives a way for users to persist a pipeline in
> JSON format. But it does not specify how users can export a pipeline for
> serving purpose. We summarized some thoughts on this in the following doc.
>
>
> https://docs.google.com/document/d/1B84b-1CvOXtwWQ6_tQyiaHwnSeiRqh-V96Or8uHqCp8/edit?usp=sharing
>
> After we reach consensus on the pipeline exportation, we will add a
> corresponding section in FLIP-39.
>
>
> Shaoxuan Wang  于2019年6月5日周三 上午8:47写道:
>
> > Stavros,
> > They have the similar logic concept, but the implementation details are
> > quite different. It is hard to migrate the interface with different
> > implementations. The built-in algorithms are useful legacy that we will
> > consider migrate to the new API (but still with different
> implementations).
> > BTW, the new API has already been merged via FLINK-12473.
> >
> > Thanks,
> > Shaoxuan
> >
> >
> >
> > On Mon, Jun 3, 2019 at 6:08 PM Stavros Kontopoulos <
> > st.kontopou...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Some portion of the code could be migrated to the new Table API no?
> > > I am saying that because the new API design is based on scikit-learn
> and
> > > the old one was also inspired by it.
> > >
> > > Best,
> > > Stavros
> > > On Wed, May 22, 2019 at 1:24 PM Shaoxuan Wang 
> > wrote:
> > >
> > > > Another consensus (from the offline discussion) is that we will
> > > > delete/deprecate flink-libraries/flink-ml. I have started a survey
> and
> > > > discussion [1] in dev/user-ml to collect the feedback. Depending on
> the
> > > > replies, we will decide if we shall delete it in Flink1.9 or
> > > > deprecate in the next release after 1.9.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Usage-of-flink-ml-and-DISCUSS-Delete-flink-ml-td29057.html
> > > >
> > > > Regards,
> > > > Shaoxuan
> > > >
> > > >
> > > > On Tue, May 21, 2019 at 9:22 PM Gen Luo  wrote:
> > > >
> > > > > Yes, this is our conclusion. I'd like to add only one point that
> > > > > registering user defined aggregator is also needed which is
> currently
> > > > > provided by 'bridge' and finally will be merged into Table API.
> It's
> > > same
> > > > > with collect().
> > > > >
> > > > > I will add a TableEnvironment argument in Estimator.fit() and
> > > > > Transformer.transform() to get rid of the dependency on
> > > > > flink-table-planner. This will be committed soon.
> > > > >
> > > > > Aljoscha Krettek  于2019年5月21日周二 下午7:31写道:
> > > > >
> > > > > > We discussed this in private and came to the conclusion that we
> > > should
> > > > > > (for now) have the dependency on flink-table-api-xxx-bridge
> because
> > > we
> > > > > need
> > > > > > access to the collect() method, which is not yet available in the
> > > Table
> > > > > > API. Once that is available the code can be refactored but for
> now
> > we
> > > > > want
> > > > > > to unblock work on this new module.
> > > > > >
> > > > > > We also agreed that we don’t need a direct dependency on
> > > > > > flink-table-planner.
> > > > > >
> > > > > > I hope I summarised our discussion correctly.
> > > > > >
> > > > > > > On 17. May 2019, at 12:20, Gen Luo 
> wrote:
> > > > > > >
> > > > > > > Thanks for your reply.
> > > > > > >
> > > > > > > For the first question, it's not strictly necessary. But I
> prefer
> > > not
> > > > > to
> > > > > > > have a TableEnvironment argument in Estimator.fit() or
> > > > > > > Transformer.transform(), which is not part of machine learning
> > > > concept,
> > > > > > and
> > > > > > > may make our API not as clean and pretty as other systems do. I
> > > would
> > > > > > like
> > > > > > > another way other than introducing flink-table-planner to do
> > this.
> > > If
> > > > > > it's
> > > > > > > impossible or severely opposed, I may make the concession to
> add
> > > the
> > > > > > > argument.
> > > > > > >
> > > > > > > Other than that, "flink-table-api-xxx-bridge"s are still
> needed.
> > A
> > > > vary
> > > > > > > common case is that an algorithm needs to guarantee that it's
> > > running
> > > > > > under
> > > > > > > a BatchTableEnvironment, which makes it possible to collect
> > result
> > > > each
> > > > > > > iteration. A typical algorithm like this is ALS. By flink1.8,
> > this
> > > > can
> > > > > be
> > > > > > > only achieved by converting Table to DataSet than call
> > > > > 

Re: Checkpointing under backpressure

2019-08-14 Thread Paris Carbone
Now I see a little more clearly what you have in mind. Thanks for the 
explanation!
There are a few intermixed concepts here, some having to do with correctness, 
some with performance.
Before delving deeper I will just enumerate a few things to make myself a 
little more helpful if I can.

I. Initiation
-

1. RPC to sources only is a less intrusive way to initiate snapshots since you 
utilize better pipeline parallelism (only a small subset of tasks is running 
progressively the protocol at a time, if snapshotting is async the overall 
overhead might not even be observable).

2. If we really want an RPC to all initiation take notice of the following 
implications:

a. (correctness) RPC calls are not guaranteed to arrive in every task 
before a marker from a preceding task.

b. (correctness) Either the RPC call OR the first arriving marker 
should initiate the algorithm. Whichever comes first. If you only do it per RPC 
call then you capture a "late" state that includes side effects of already 
logged events.

c. (performance) Lots of IO will be invoked at the same time on the 
backend store from all tasks. This might lead to high congestion in async 
snapshots.

II. Capturing State First
-

1. (correctness) Capturing state at the last marker sounds incorrect to me 
(the state contains side effects of already-logged events under the proposed 
scheme). This results in duplicate processing. No?

III. Channel Blocking / "Alignment"
---

1. (performance?) What is the added benefit? We don't want a "complete" 
transactional snapshot; async snapshots are purely for failure recovery. Thus, 
I don't see why this needs to be imposed at the expense of 
performance/throughput. With the proposed scheme the whole dataflow anyway 
enters snapshotting/logging mode, so tasks more or less snapshot concurrently. 

IV Marker Bypassing
---

1. (correctness) This leads to equivalent in-flight snapshots, so on some 
quick thinking it looks correct. I will try to model this later and get back to you in 
case I find something wrong.

2. (performance) It also sounds like a meaningful optimisation! I like thinking 
of this as a push-based snapshot, i.e., the producing task somehow triggers the 
downstream consumer/channel to capture its state. For example, consider T1 -> 
|marker t1| -> T2. 

V. Usage of "Async" Snapshots
-

1. Do you see this as a full replacement of "full" aligned 
snapshots/savepoints? In my view async snapshots will be needed from time to 
time but not as frequently. Yet, it seems like a valid approach solely for 
failure-recovery on the same configuration. Here's why:

a. With original snapshotting there is a strong duality between 
a stream input (offsets) and committed side effects (internal states 
and external commits to transactional sinks). While in the async version, there 
are uncommitted operations (inflight records). Thus, you cannot use these 
snapshots for e.g., submitting sql queries with snapshot isolation. Also, the 
original snapshotting gives a lot of potential for flink to make proper 
transactional commits externally.

b. Reconfiguration is very tricky, you probably know that better. 
Inflight channel state is no longer valid in a new configuration (i.e., new 
dataflow graph, new operators, updated operator logic, different channels, 
different parallelism)

2. Async snapshots can also be potentially useful for monitoring the general 
health of a dataflow, since they can be analyzed by the task manager to assess the 
performance of a job graph and spot bottlenecks, for example.

> On 14 Aug 2019, at 09:08, Piotr Nowojski  wrote:
> 
> Hi,
> 
> Thomas: 
> There are no Jira tickets yet (or maybe there is something very old 
> somewhere). First we want to discuss it, next present FLIP and at last create 
> tickets :)
> 
>> if I understand correctly, then the proposal is to not block any
>> input channel at all, but only log data from the backpressured channel (and
>> make it part of the snapshot) until the barrier arrives
> 
> I would guess that it would be better to block the reads, unless we can 
> already process the records from the blocked channel…
> 
> Paris:
> 
> Thanks for the explanation Paris. I’m starting to understand this more and I 
> like the idea of snapshotting the state of an operator before receiving all 
> of the checkpoint barriers - this would allow more things to happen at the 
> same time instead of sequentially. As Zhijiang has pointed out there are some 
> things not considered in your proposal: overtaking output buffers, but maybe 
> those things could be incorporated together.
> 
> Another thing is that from the wiki description I understood that the initial 
> checkpointing is not initialised by any checkpoint barrier, but by an 
> independent call/message from the Observer. I haven’t played with this idea a 
> lot, but I had 

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-14 Thread Jark Wu
Hi Gordon,

I have verified the following things:

- built the source release with Scala 2.12 and Scala 2.11 successfully
- checked/verified signatures and hashes
- checked that all POM files point to the same version
- ran some Flink Table related end-to-end tests locally and they succeeded
(except the TPC-H e2e test, whose failure is reported in FLINK-13704)
- started clusters for both Scala 2.11 and 2.12, ran examples, verified the web
UI and log output, nothing unexpected
- started a cluster, ran a SQL query doing a temporal join of a Kafka source
with a MySQL JDBC table, and wrote the results back to Kafka, using DDL to
create the source and sinks. Looks good.
- reviewed the release PR

As FLINK-13704 is not recognized as a blocker issue, +1 from my side
(non-binding).

On Tue, 13 Aug 2019 at 17:07, Till Rohrmann  wrote:

> Hi Richard,
>
> although I can see that it would be handy for users who have PubSub set up,
> I would rather not include examples which require an external dependency
> into the Flink distribution. I think examples should be self-contained. My
> concern is that we would bloat the distribution for many users at the
> benefit of a few. Instead, I think it would be better to make these
> examples available differently, maybe through Flink's ecosystem website or
> maybe a new examples section in Flink's documentation.
>
> Cheers,
> Till
>
> On Tue, Aug 13, 2019 at 9:43 AM Jark Wu  wrote:
>
> > Hi Till,
> >
> > After thinking about it, we can use VARCHAR as an alternative to
> > timestamp/time/date.
> > I'm fine with not recognize it as a blocker issue.
> > We can fix it into 1.9.1.
> >
> >
> > Thanks,
> > Jark
> >
> >
> > On Tue, 13 Aug 2019 at 15:10, Richard Deurwaarder 
> wrote:
> >
> > > Hello all,
> > >
> > > I noticed the PubSub example jar is not included in the examples/ dir
> of
> > > flink-dist. I've created
> > https://issues.apache.org/jira/browse/FLINK-13700
> > >  + https://github.com/apache/flink/pull/9424/files to fix this.
> > >
> > > I will leave it up to you to decide if we want to add this to 1.9.0.
> > >
> > > Regards,
> > >
> > > Richard
> > >
> > > On Tue, Aug 13, 2019 at 9:04 AM Till Rohrmann 
> > > wrote:
> > >
> > > > Hi Jark,
> > > >
> > > > thanks for reporting this issue. Could this be a documented
> limitation
> > of
> > > > Blink's preview version? I think we have agreed that the Blink SQL
> > > planner
> > > > will be rather a preview feature than production ready. Hence it
> could
> > > > still contain some bugs. My concern is that there might be still
> other
> > > > issues which we'll discover bit by bit and could postpone the release
> > > even
> > > > further if we say Blink bugs are blockers.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Aug 13, 2019 at 7:42 AM Jark Wu  wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I just find an issue when testing connector DDLs against blink
> > planner
> > > > for
> > > > > rc2.
> > > > > This issue lead to the DDL doesn't work when containing
> > > > timestamp/date/time
> > > > > type.
> > > > > I have created an issue FLINK-13699[1] and a pull request for this.
> > > > >
> > > > > IMO, this can be a blocker issue of 1.9 release. Because
> > > > > timestamp/date/time are primitive types, and this will break the
> DDL
> > > > > feature.
> > > > > However, I want to hear more thoughts from the community whether we
> > > > should
> > > > > recognize it as a blocker.
> > > > >
> > > > > Thanks,
> > > > > Jark
> > > > >
> > > > >
> > > > > [1]: https://issues.apache.org/jira/browse/FLINK-13699
> > > > >
> > > > >
> > > > >
> > > > > On Mon, 12 Aug 2019 at 22:46, Becket Qin 
> > wrote:
> > > > >
> > > > > > Thanks Gordon, will do that.
> > > > > >
> > > > > > On Mon, Aug 12, 2019 at 4:42 PM Tzu-Li (Gordon) Tai <
> > > > tzuli...@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Concerning FLINK-13231:
> > > > > > >
> > > > > > > Since this is a @PublicEvolving interface, technically it is ok
> > to
> > > > > break
> > > > > > > it across releases (including across bugfix releases?).
> > > > > > > So, @Becket if you do merge it now, please mark the fix version
> > as
> > > > > 1.9.1.
> > > > > > >
> > > > > > > During the voting process, in the case a new RC is created, we
> > > > usually
> > > > > > > check the list of changes compared to the previous RC, and
> > correct
> > > > the
> > > > > > "Fix
> > > > > > > Version" of the corresponding JIRAs to be the right version (in
> > the
> > > > > case,
> > > > > > > it would be corrected to 1.9.0 instead of 1.9.1).
> > > > > > >
> > > > > > > On Mon, Aug 12, 2019 at 4:25 PM Till Rohrmann <
> > > trohrm...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> I agree that it would be nicer. Not sure whether we should
> > cancel
> > > > the
> > > > > RC
> > > > > > >> for this issue given that it is open for quite some time and
> > > hasn't
> > > > > been
> > > > > > >> addressed until very recently. Maybe we could include it on
> the
> > > > > > shortlist
> > 

Re: [DISCUSS] Drop stale class Program

2019-08-14 Thread Kostas Kloudas
Hi all,

It is nice to have this discussion.

I am totally up for removing the unused Program interface, as this will
simplify a lot of other code paths in the ClusterClient and elsewhere.

Also about the easier integration of Flink with other frameworks, there
is another discussion in the mailing list with exactly this topic:
[DISCUSS] Flink client api enhancement for downstream project

Cheers,
Kostas


On Tue, Jul 30, 2019 at 1:38 PM Zili Chen  wrote:

> Hi,
>
> After a one-week survey in the user list[1], nobody except Flavio and Jeff
> participated in the thread.
>
> Flavio shared his experience with a revised Program-like interface.
> This could be regarded as downstream integration, and in the client api
> enhancements document we propose a rich interface for this integration.
> Anyway, the Flink-scoped Program is less functional than it should be.
>
> With no objection I'd like to push on this thread. We need a committer
> participant this thread to shepherd the removal/deprecation of Program, a
> @PublicEvolving interface. Anybody has spare time? Or anything I can do
> to make progress?
>
> Best,
> tison.
>
> [1]
>
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
>
>
> Zili Chen  于2019年7月22日周一 下午8:38写道:
>
> > Hi,
> >
> > I created a thread for survey in user list[1]. Please take participate in
> > if interested.
> >
> > Best,
> > tison.
> >
> > [1]
> >
> https://lists.apache.org/thread.html/37445e43729cf7eaeb0aa09133d3980b62f891c5ee69d2c3c3e76ab5@%3Cuser.flink.apache.org%3E
> >
> >
> > Flavio Pompermaier  于2019年7月19日周五 下午5:18写道:
> >
> >> +1 to remove directly the Program class (I think nobody use it and it's
> >> not
> >> supported at all by REST services and UI).
> >> Moreover it requires a lot of transitive dependencies while it should
> be a
> >> very simple thing..
> >> +1 to add this discussion to "Flink client api enhancement"
> >>
> >> On Fri, Jul 19, 2019 at 11:14 AM Biao Liu  wrote:
> >>
> >> > To Flavio, good point for the integration suggestion.
> >> >
> >> > I think it should be considered in the "Flink client api enhancement"
> >> > discussion. But the outdated API should be deprecated somehow.
> >> >
> >> > Flavio Pompermaier  于2019年7月19日周五 下午4:21写道:
> >> >
> >> > > In my experience a basic "official" (but optional) program
> description
> >> > > would be very useful indeed (in order to ease the integration with
> >> other
> >> > > frameworks).
> >> > >
> >> > > Of course it should be extended and integrated with the REST
> services
> >> and
> >> > > the Web UI (when defined) in order to be useful..
> >> > > It ease to show to the user what a job does and which parameters it
> >> > > requires (optional or mandatory) and with a proper help description.
> >> > > Indeed, when we write a Flink job we implement the following
> >> interface:
> >> > >
> >> > > public interface FlinkJob {
> >> > >   String getDescription();
> >> > >   List<ClusterJobParameter> getParameters();
> >> > >   boolean isStreamingOrBatch();
> >> > > }
> >> > >
> >> > > public class ClusterJobParameter {
> >> > >
> >> > >   private String paramName;
> >> > >   private String paramType = "string";
> >> > >   private String paramDesc;
> >> > >   private String paramDefaultValue;
> >> > >   private Set<String> choices;
> >> > >   private boolean mandatory;
> >> > > }
> >> > >
> >> > > This really helps to launch a Flink job by a frontend (if the rest
> >> > services
> >> > > returns back those infos).
> >> > >
> >> > > Best,
> >> > > Flavio
> >> > >
> >> > > On Fri, Jul 19, 2019 at 9:57 AM Biao Liu 
> wrote:
> >> > >
> >> > > > Hi Zili,
> >> > > >
> >> > > > Thank you for bring us this discussion.
> >> > > >
> >> > > > My gut feeling is +1 for dropping it.
> >> > > > Usually it costs some time to deprecate a public (actually it's
> >> > > > `PublicEvolving`) API. Ideally it should be marked as `Deprecated`
> >> > first.
> >> > > > Then it might be abandoned it in some later version.
> >> > > >
> >> > > > I'm not sure how big the burden is to make it compatible with the
> >> > > enhanced
> >> > > > client API. If it's a critical blocker, I support dropping it
> >> radically
> >> > > in
> >> > > > 1.10. Of course a survey is necessary. And the result of survey is
> >> > > > acceptable.
> >> > > >
> >> > > >
> >> > > >
> >> > > > Zili Chen  于2019年7月19日周五 下午1:44写道:
> >> > > >
> >> > > > > Hi devs,
> >> > > > >
> >> > > > > Participating the thread "Flink client api enhancement"[1], I
> just
> >> > > notice
> >> > > > > that inside submission codepath of Flink we always has a branch
> >> > dealing
> >> > > > > with the case that main class of user program is a subclass of
> >> > > > > o.a.f.api.common.Program, which is defined as
> >> > > > >
> >> > > > > @PublicEvolving
> >> > > > > public interface Program {
> >> > > > >   Plan getPlan(String... args);
> >> > > > > }
> >> > > > >
> >> > > > > This class, as user-facing interface, asks user to implement
> >> #getPlan
> >> > > > > which return a 

Re: Checkpointing under backpressure

2019-08-14 Thread Zhu Zhu
Thanks Piotr and Zhijiang for sharing the thoughts on unaligned
checkpointing and the barrier overtaking.

I have a question about 2.d) in Piotr's last mail that states "the Task
first has to process the buffered data after that it can unblock the reads
from the channels".
Does this mean that we do not want checkpoint 2 to happen if the data
before checkpoint 1 barriers are not fully processed, as the barriers of
checkpoint 2 can be blocked?
If so, the checkpointing may get blocked once the job is overloaded.
If not, an overloaded job may lead to a larger and larger checkpoint.

And I agree a lot with point 4 in Piotr's last mail, and think it's
necessary for a task that has already snapshotted for CP1 to process the CP1
buffered data before all CP1 barriers are received in this task.
Otherwise it might be a processing performance regression compared to the
current exactly-once checkpointing.

Thanks,
Zhu Zhu

Piotr Nowojski  于2019年8月14日周三 下午3:08写道:

> Hi,
>
> Thomas:
> There are no Jira tickets yet (or maybe there is something very old
> somewhere). First we want to discuss it, next present FLIP and at last
> create tickets :)
>
> > if I understand correctly, then the proposal is to not block any
> > input channel at all, but only log data from the backpressured channel
> (and
> > make it part of the snapshot) until the barrier arrives
>
> I would guess that it would be better to block the reads, unless we can
> already process the records from the blocked channel…
>
> Paris:
>
> Thanks for the explanation Paris. I’m starting to understand this more and
> I like the idea of snapshotting the state of an operator before receiving
> all of the checkpoint barriers - this would allow more things to happen at
> the same time instead of sequentially. As Zhijiang has pointed out there
> are some things not considered in your proposal: overtaking output buffers,
> but maybe those things could be incorporated together.
>
> Another thing is that from the wiki description I understood that the
> initial checkpointing is not initialised by any checkpoint barrier, but by
> an independent call/message from the Observer. I haven’t played with this
> idea a lot, but I had some discussion with Nico and it seems that it might
> work:
>
> 1. JobManager sends and RPC “start checkpoint” to all tasks
> 2. Task (with two input channels l1 and l2) upon receiving RPC from 1.,
> takes a snapshot of it's state and:
>   a) broadcast checkpoint barrier down the stream to all channels (let’s
> ignore for a moment potential for this barrier to overtake the buffer
> output data)
>   b) for any input channel for which it hasn’t yet received checkpoint
> barrier, the data are being added to the checkpoint
>   c) once a channel (for example l1) receives a checkpoint barrier, the
> Task blocks reads from that channel (?)
>   d) after all remaining channels (l2) receive checkpoint barriers, the
> Task  first has to process the buffered data after that it can unblock the
> reads from the channels
>
> Checkpoint barriers do not cascade/flow through different tasks here.
> Checkpoint barrier emitted from Task1, reaches only the immediate
> downstream Tasks. Thanks to this setup, total checkpointing time is not sum
> of checkpointing times of all Tasks one by one, but more or less max of the
> slowest Tasks. Right?
>
> Couple of intriguing thoughts are:
>  3. checkpoint barriers overtaking the output buffers
>  4. can we keep processing some data (in order to not waste CPU cycles)
> after we have taken the snapshot of the Task. I think we could.
>
> Piotrek
>
> > On 14 Aug 2019, at 06:00, Thomas Weise  wrote:
> >
> > Great discussion! I'm excited that this is already under consideration!
> Are
> > there any JIRAs or other traces of discussion to follow?
> >
> > Paris, if I understand correctly, then the proposal is to not block any
> > input channel at all, but only log data from the backpressured channel
> (and
> > make it part of the snapshot) until the barrier arrives? This is
> > intriguing. But probably there is also a benefit of to not continue
> reading
> > I1 since that could speed up retrieval from I2. Also, if the user code is
> > the cause of backpressure, this would avoid pumping more data into the
> > process function.
> >
> > Thanks,
> > Thomas
> >
> >
> > On Tue, Aug 13, 2019 at 8:02 AM zhijiang  .invalid>
> > wrote:
> >
> >> Hi Paris,
> >>
> >> Thanks for the detailed sharing. And I think it is very similar with the
> >> way of overtaking we proposed before.
> >>
> >> There are some tiny difference:
> >> The way of overtaking might need to snapshot all the input/output
> queues.
> >> Chandy Lamport seems only need to snapshot (n-1) input channels after
> the
> >> first barrier arrives, which might reduce the state size a bit. But
> normally
> >> there should be less buffers for the first input channel with barrier.
> >> The output barrier still follows with regular data stream in Chandy
> >> Lamport, the same way as current flink. For overtaking way, we 

Re: Launching Flink server from IntelliJ

2019-08-14 Thread Till Rohrmann
Hi Ramayan,

you can start a random example [1] from the IDE. The local execution
environment will start a MiniCluster to execute the Flink job in the
process started by the IDE.

[1] https://github.com/apache/flink/tree/master/flink-examples
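
For instance, a trivial job along these lines (class name illustrative; it only
assumes flink-streaming-java on the classpath) can be run and debugged from a
plain Run/Debug configuration, usually without any special VM arguments:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalDebugJob {

    public static void main(String[] args) throws Exception {
        // Inside the IDE this returns a local environment that starts an
        // embedded MiniCluster in the same JVM, so breakpoints placed in
        // Flink's runtime classes are hit directly.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("flink", "local", "debug")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .print();

        env.execute("local-debug-job");
    }
}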

Cheers,
Till

On Wed, Aug 14, 2019 at 12:20 AM Ramayan Tiwari 
wrote:

> Hello,
>
> Is there any documentation around launching Flink service from Intellij (VM
> args, Main class etc)? I have found the documentation on remote debugging
> <
> https://cwiki.apache.org/confluence/display/FLINK/Remote+Debugging+of+Flink+Clusters
> >,
> but couldn't find steps to launch Flink from source, any pointers?
>
> I am new to Flink and trying to launch Flink in debug mode to be able to
> step through some of server side code for understanding the internals. Any
> resource to get started would be appreciated.
>
> Thanks
> Ramayan
>


[jira] [Created] (FLINK-13709) Kafka09ITCase hangs when starting the KafkaServer on Travis

2019-08-14 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-13709:
-

 Summary: Kafka09ITCase hangs when starting the KafkaServer on 
Travis
 Key: FLINK-13709
 URL: https://issues.apache.org/jira/browse/FLINK-13709
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Tests
Affects Versions: 1.9.0, 1.10.0
Reporter: Till Rohrmann


The {{Kafka09ITCase}} hangs when starting the {{KafkaServer}} on Travis.

https://api.travis-ci.org/v3/job/571295948/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Checkpointing under backpressure

2019-08-14 Thread Piotr Nowojski
Hi,

Thomas: 
There are no Jira tickets yet (or maybe there is something very old somewhere). 
First we want to discuss it, next present FLIP and at last create tickets :)

> if I understand correctly, then the proposal is to not block any
> input channel at all, but only log data from the backpressured channel (and
> make it part of the snapshot) until the barrier arrives

I would guess that it would be better to block the reads, unless we can already 
process the records from the blocked channel…

Paris:

Thanks for the explanation Paris. I’m starting to understand this more and I 
like the idea of snapshotting the state of an operator before receiving all of 
the checkpoint barriers - this would allow more things to happen at the same 
time instead of sequentially. As Zhijiang has pointed out there are some things 
not considered in your proposal: overtaking output buffers, but maybe those 
things could be incorporated together.

Another thing is that from the wiki description I understood that the initial 
checkpointing is not initialised by any checkpoint barrier, but by an 
independent call/message from the Observer. I haven’t played with this idea a 
lot, but I had some discussion with Nico and it seems that it might work:

1. JobManager sends an RPC “start checkpoint” to all tasks
2. Task (with two input channels l1 and l2) upon receiving RPC from 1., takes a 
snapshot of its state and:
  a) broadcast checkpoint barrier down the stream to all channels (let’s ignore 
for a moment potential for this barrier to overtake the buffer output data)
  b) for any input channel for which it hasn’t yet received checkpoint barrier, 
the data are being added to the checkpoint
  c) once a channel (for example l1) receives a checkpoint barrier, the Task 
blocks reads from that channel (?)
  d) after all remaining channels (l2) receive checkpoint barriers, the Task  
first has to process the buffered data after that it can unblock the reads from 
the channels
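
A toy model of the per-task logic in steps 1-2, purely for illustration: the names
below are invented, this is not Flink's runtime API, and the "process the buffered
data first" detail of 2d is glossed over.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RpcInitiatedSnapshotSketch {

    private final Set<String> channelsAwaitingBarrier = new HashSet<>();
    private final Set<String> blockedChannels = new HashSet<>();
    private final Map<String, List<String>> loggedInFlightData = new HashMap<>();
    private String operatorState = "initial";
    private String snapshottedState;

    RpcInitiatedSnapshotSketch(Set<String> inputChannels) {
        channelsAwaitingBarrier.addAll(inputChannels);
        for (String channel : inputChannels) {
            loggedInFlightData.put(channel, new ArrayList<>());
        }
    }

    // Step 1 + 2/2a: on the "start checkpoint" RPC, snapshot the state and broadcast the barrier.
    void onStartCheckpointRpc() {
        snapshottedState = operatorState;
        broadcastBarrierDownstream();
    }

    // Normal record path; 2b: records from channels that have not delivered their
    // barrier yet are additionally logged into the checkpoint.
    void onRecord(String channel, String record) {
        if (blockedChannels.contains(channel)) {
            throw new IllegalStateException("reads from " + channel + " are blocked (2c)");
        }
        if (snapshottedState != null && channelsAwaitingBarrier.contains(channel)) {
            loggedInFlightData.get(channel).add(record);
        }
        operatorState = "after-" + record;
    }

    // 2c: block a channel once its barrier has arrived; 2d: complete the checkpoint
    // and unblock the reads once all barriers are in.
    void onBarrier(String channel) {
        channelsAwaitingBarrier.remove(channel);
        blockedChannels.add(channel);
        if (channelsAwaitingBarrier.isEmpty()) {
            System.out.println("checkpoint = " + snapshottedState + " + " + loggedInFlightData);
            blockedChannels.clear();
        }
    }

    private void broadcastBarrierDownstream() {
        // network transport is out of scope for this sketch
    }
}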

Checkpoint barriers do not cascade/flow through different tasks here. 
Checkpoint barrier emitted from Task1, reaches only the immediate downstream 
Tasks. Thanks to this setup, total checkpointing time is not sum of 
checkpointing times of all Tasks one by one, but more or less max of the 
slowest Tasks. Right?

Couple of intriguing thoughts are:
 3. checkpoint barriers overtaking the output buffers
 4. can we keep processing some data (in order to not waste CPU cycles) after 
we have taken the snapshot of the Task. I think we could.

Piotrek

> On 14 Aug 2019, at 06:00, Thomas Weise  wrote:
> 
> Great discussion! I'm excited that this is already under consideration! Are
> there any JIRAs or other traces of discussion to follow?
> 
> Paris, if I understand correctly, then the proposal is to not block any
> input channel at all, but only log data from the backpressured channel (and
> make it part of the snapshot) until the barrier arrives? This is
> intriguing. But probably there is also a benefit of to not continue reading
> I1 since that could speed up retrieval from I2. Also, if the user code is
> the cause of backpressure, this would avoid pumping more data into the
> process function.
> 
> Thanks,
> Thomas
> 
> 
> On Tue, Aug 13, 2019 at 8:02 AM zhijiang 
> wrote:
> 
>> Hi Paris,
>> 
>> Thanks for the detailed sharing. And I think it is very similar with the
>> way of overtaking we proposed before.
>> 
>> There are some tiny difference:
>> The way of overtaking might need to snapshot all the input/output queues.
>> Chandy Lamport seems only need to snapshot (n-1) input channels after the
>> first barrier arrives, which might reduce the state size a bit. But normally
>> there should be less buffers for the first input channel with barrier.
>> The output barrier still follows with regular data stream in Chandy
>> Lamport, the same way as current flink. For overtaking way, we need to pay
>> extra efforts to make barrier transport firstly before output queue on
>> upstream side, and change the way of barrier alignment based on receiving
>> instead of current reading on downstream side.
>> In the backpressure caused by data skew, the first barrier in almost empty
>> input channel should arrive much earlier than the last heavy load input
>> channel, so the Chandy Lamport could benefit well. But for the case of all
>> balanced heavy load input channels, I mean the first arrived barrier might
>> still take much time, then the overtaking way could still fit well to speed
>> up checkpoint.
>> Anyway, your proposed suggestion is helpful on my side, especially
>> considering some implementation details .
>> 
>> Best,
>> Zhijiang
>> --
>> From:Paris Carbone 
>> Send Time:2019年8月13日(星期二) 14:03
>> To:dev 
>> Cc:zhijiang 
>> Subject:Re: Checkpointing under backpressure
>> 
>> yes! It’s quite similar I think.  Though mind that the devil is in the
>> details, i.e., the temporal order actions are taken.
>> 

[jira] [Created] (FLINK-13708) transformations should be cleared because a table environment could execute multiple job

2019-08-14 Thread godfrey he (JIRA)
godfrey he created FLINK-13708:
--

 Summary: transformations should be cleared because a table 
environment could execute multiple job
 Key: FLINK-13708
 URL: https://issues.apache.org/jira/browse/FLINK-13708
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.9.0


Currently, if a table environment executes more than one SQL job, the following
job contains the transformations of the previous job. The reason is that the
transformations are not cleared after execution.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Checkpointing under backpressure

2019-08-14 Thread zhijiang
Hi Thomas,

There are no Jira tickets or discussions at the moment. If there are any 
updates I would ping you.

I agree that there is a benefit to reading the other channels without a barrier
first, at high priority; otherwise it seems a waste of CPU resources to migrate
blocked buffers to another cached queue. Actually, we implemented this
improvement in Alibaba's private branch and it has been running for a long time
in real production. It indeed speeds up barrier alignment a bit if the blocked
input channels have some buffers.

Best,
Zhijiang
--
From:Thomas Weise 
Send Time:2019年8月14日(星期三) 06:00
To:dev ; zhijiang 
Cc:Paris Carbone 
Subject:Re: Checkpointing under backpressure

Great discussion! I'm excited that this is already under consideration! Are
there any JIRAs or other traces of discussion to follow?

Paris, if I understand correctly, then the proposal is to not block any
input channel at all, but only log data from the backpressured channel (and
make it part of the snapshot) until the barrier arrives? This is
intriguing. But probably there is also a benefit to not continuing to read
I1 since that could speed up retrieval from I2. Also, if the user code is
the cause of backpressure, this would avoid pumping more data into the
process function.

Thanks,
Thomas


On Tue, Aug 13, 2019 at 8:02 AM zhijiang 
wrote:

> Hi Paris,
>
> Thanks for the detailed sharing. And I think it is very similar with the
> way of overtaking we proposed before.
>
> There are some tiny difference:
> The way of overtaking might need to snapshot all the input/output queues.
> Chandy Lamport seems only need to snapshot (n-1) input channels after the
> first barrier arrives, which might reduce the state size a bit. But normally
> there should be less buffers for the first input channel with barrier.
> The output barrier still follows with regular data stream in Chandy
> Lamport, the same way as current flink. For overtaking way, we need to pay
> extra efforts to make barrier transport firstly before output queue on
> upstream side, and change the way of barrier alignment based on receiving
> instead of current reading on downstream side.
> In the backpressure caused by data skew, the first barrier in almost empty
> input channel should arrive much earlier than the last heavy load input
> channel, so the Chandy Lamport could benefit well. But for the case of all
> balanced heavy load input channels, I mean the first arrived barrier might
> still take much time, then the overtaking way could still fit well to speed
> up checkpoint.
> Anyway, your proposed suggestion is helpful on my side, especially
> considering some implementation details .
>
> Best,
> Zhijiang
> --
> From:Paris Carbone 
> Send Time:2019年8月13日(星期二) 14:03
> To:dev 
> Cc:zhijiang 
> Subject:Re: Checkpointing under backpressure
>
> yes! It’s quite similar I think.  Though mind that the devil is in the
> details, i.e., the temporal order actions are taken.
>
> To clarify, let us say you have a task T with two input channels I1 and I2.
> The Chandy Lamport execution flow is the following:
>
> 1) T receives barrier from  I1 and...
> 2)  ...the following three actions happen atomically
>  I )  T snapshots its state T*
>  II)  T forwards marker to its outputs
>  III) T starts logging all events of I2 (only) into a buffer M*
> - Also notice here that T does NOT block I1 as it does in aligned
> snapshots -
> 3) Eventually T receives barrier from I2 and stops recording events. Its
> asynchronously captured snapshot is now complete: {T*,M*}.
> Upon recovery all messages of M* should be replayed in FIFO order.
>
> With this approach alignment does not create a deadlock situation since
> anyway 2.II happens asynchronously and messages can be logged as well
> asynchronously during the process of the snapshot. If there is
> back-pressure in a pipeline the cause is most probably not this algorithm.
>
> Back to your observation, the answer : yes and no.  In your network model,
> I can see the logic of “logging” and “committing” a final snapshot being
> provided by the channel implementation. However, do mind that the first
> barrier always needs to go “all the way” to initiate the Chandy Lamport
> algorithm logic.
>
> The above flow has been proven using temporal logic in my phd thesis in
> case you are interested about the proof.
> I hope this helps a little clarifying things. Let me know if there is any
> confusing point to disambiguate. I would be more than happy to help if I
> can.
>
> Paris
>
> > On 13 Aug 2019, at 13:28, Piotr Nowojski  wrote:
> >
> > Thanks for the input. Regarding the Chandy-Lamport snapshots don’t you
> still have to wait for the “checkpoint barrier” to arrive in order to know
> when have you already received all possible messages from the upstream
> tasks/operators? So instead of processing the “in flight” messages (as the
> Flink is 
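
For reference, the Chandy-Lamport flow Paris describes above (snapshot the state on
the first barrier, keep reading all channels, log the channels without a barrier
until their barriers arrive) can be sketched in the same toy style; the names are
invented and this is not Flink code:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChandyLamportSketch {

    private final Set<String> channelsWithoutBarrier = new HashSet<>();
    private final List<String> loggedMessages = new ArrayList<>(); // M*
    private String stateSnapshot;                                  // T*
    private String state = "initial";
    private boolean snapshotInProgress;

    ChandyLamportSketch(Set<String> inputChannels) {
        channelsWithoutBarrier.addAll(inputChannels);
    }

    void onRecord(String channel, String record) {
        // No channel is ever blocked: every record keeps being processed...
        state = "after-" + record;
        // ...but records from channels that have not delivered their barrier yet
        // are additionally logged into the snapshot (step 2.III).
        if (snapshotInProgress && channelsWithoutBarrier.contains(channel)) {
            loggedMessages.add(record);
        }
    }

    void onBarrier(String channel) {
        if (!snapshotInProgress) {
            // First barrier (step 2): snapshot the state, forward the marker and
            // start logging the remaining channels.
            snapshotInProgress = true;
            stateSnapshot = state;      // 2.I
            forwardMarkerDownstream();  // 2.II
        }
        channelsWithoutBarrier.remove(channel);
        if (channelsWithoutBarrier.isEmpty()) {
            // Step 3: the snapshot {T*, M*} is complete; M* would be replayed in
            // FIFO order on recovery.
            snapshotInProgress = false;
            System.out.println("snapshot = {" + stateSnapshot + ", " + loggedMessages + "}");
        }
    }

    private void forwardMarkerDownstream() {
        // network transport is out of scope for this sketch
    }
}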

Unbearably slow Table API time-windowed stream join with RocksDBStateBackend

2019-08-14 Thread LIU Xiao
Example SQL:

SELECT *
FROM stream1 s1, stream2 s2
WHERE s1.id = s2.id AND s1.rowtime = s2.rowtime

And we have lots of messages in stream1 and stream2 that share the same rowtime.

It runs fine when using the heap state backend, but sometimes requires lots of
heap memory (when the upstreams are out of sync, etc.), and there is a risk of
full GC.

When we use RocksDBStateBackend to lower the heap memory usage, we found that
our program runs unbearably slowly.

After examining the code, we found that
org.apache.flink.table.runtime.join.TimeBoundedStreamJoin#processElement1()
may be the cause of the problem (we are using Flink 1.6, but 1.8 should be the same):
...
// Check if we need to cache the current row.
if (rightOperatorTime < rightQualifiedUpperBound) {
  // Operator time of right stream has not exceeded the upper window bound 
of the current
  // row. Put it into the left cache, since later coming records from the 
right stream are
  // expected to be joined with it.
  var leftRowList = leftCache.get(timeForLeftRow)
  if (null == leftRowList) {
leftRowList = new util.ArrayList[JTuple2[Row, Boolean]](1)
  }
  leftRowList.add(JTuple2.of(leftRow, emitted))
  leftCache.put(timeForLeftRow, leftRowList)
...

In the above code, if there are lots of messages with the same timeForLeftRow,
the serialization and deserialization cost will be very high when using
RocksDBStateBackend.

A simple fix I came up with:
...
  // cache to store rows from the left stream
  //private var leftCache: MapState[Long, JList[JTuple2[Row, Boolean]]] = _
  private var leftCache: MapState[JTuple2[Long, Integer], JTuple2[Row, Boolean]] = _
  private var leftCacheSize: MapState[Long, Integer] = _
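  // With this layout each row lives under its own (timeForLeftRow, index) key, so
  // RocksDB only has to (de)serialize a single row per access instead of the whole
  // per-timestamp list; leftCacheSize keeps the row count per timestamp so that the
  // read/cleanup path (not shown here) can enumerate the keys (timeForLeftRow, 0..count-1).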
...
// Check if we need to cache the current row.
if (rightOperatorTime < rightQualifiedUpperBound) {
  // Operator time of right stream has not exceeded the upper window bound 
of the current
  // row. Put it into the left cache, since later coming records from the 
right stream are
  // expected to be joined with it.
  //var leftRowList = leftCache.get(timeForLeftRow)
  //if (null == leftRowList) {
  //  leftRowList = new util.ArrayList[JTuple2[Row, Boolean]](1)
  //}
  //leftRowList.add(JTuple2.of(leftRow, emitted))
  //leftCache.put(timeForLeftRow, leftRowList)
  var leftRowListSize = leftCacheSize.get(timeForLeftRow)
  if (null == leftRowListSize) {
leftRowListSize = new Integer(0)
  }
  leftCache.put(JTuple2.of(timeForLeftRow, leftRowListSize), 
JTuple2.of(leftRow, emitted))
  leftCacheSize.put(timeForLeftRow, leftRowListSize + 1)
...

-- 
LIU Xiao