[jira] [Created] (FLINK-25205) Optimize SinkUpsertMaterializer

2021-12-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-25205:


 Summary: Optimize SinkUpsertMaterializer
 Key: FLINK-25205
 URL: https://issues.apache.org/jira/browse/FLINK-25205
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Reporter: Jingsong Lee


SinkUpsertMaterializer maintains incoming records in state keyed by the 
upsert keys and generates an upsert view for the downstream operator.

It is intended to solve the out-of-order problem caused by the upstream 
computation, but it stores the records in state, which grows bigger and 
bigger over time.

If we can assume that the disorder only occurs within a checkpoint 
interval, we can consider clearing the state at each checkpoint, which 
would bound the size of the state.

We could start by adding a config option to opt into this optimization.
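As a plain-Java sketch of the idea (illustrative only — this is not Flink's actual SinkUpsertMaterializer, and all names are hypothetical), the state could be dropped at each checkpoint to bound its size:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a checkpoint-scoped upsert materializer.
// State maps an upsert key to the rows seen since the last checkpoint;
// the newest row is the upsert view emitted downstream.
class CheckpointScopedMaterializer {
    private final Map<String, Deque<String>> state = new HashMap<>();

    void addRow(String upsertKey, String row) {
        state.computeIfAbsent(upsertKey, k -> new ArrayDeque<>()).push(row);
    }

    // Current upsert view for a key: the most recent row, or null.
    String currentView(String upsertKey) {
        Deque<String> rows = state.get(upsertKey);
        return rows == null || rows.isEmpty() ? null : rows.peek();
    }

    // If disorder is assumed to happen only within a checkpoint interval,
    // the whole state can be cleared at each checkpoint, keeping it bounded.
    void onCheckpointComplete() {
        state.clear();
    }

    int stateSize() {
        return state.size();
    }
}
```

Under this assumption the state never outgrows a single checkpoint interval, at the cost of losing the ability to repair disorder that spans checkpoints — hence the suggestion to guard it behind a config option.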



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25204) Spell one more as in the DataStream interval join docs

2021-12-06 Thread tartarus (Jira)
tartarus created FLINK-25204:


 Summary: Spell one more as in the DataStream interval join docs
 Key: FLINK-25204
 URL: https://issues.apache.org/jira/browse/FLINK-25204
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.14.0
Reporter: tartarus


There is one duplicated "as" in the DataStream interval join documentation:
Both the lower and upper bound can be either negative or positive as long 
as *as* the lower bound is always smaller or equal to the upper bound.
=>
Both the lower and upper bound can be either negative or positive as long 
as the lower bound is always smaller or equal to the upper bound.





Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-12-06 Thread wenlong.lwl
Maybe we can add support for this case:
when an exec node is changed in 1.16 but is still compatible with 1.15,
we can use the 1.16 node to deserialize the 1.15 plan.
This way, we don't need to fork the code when the change is compatible,
and we can avoid forking code frequently.


Best,
Wenlong


On Tue, 7 Dec 2021 at 15:08, wenlong.lwl  wrote:

> hi, Timo, I would prefer to update the version every time we change the
> state layout too.
>
> It could be possible that we change an exec node in 2 steps:
> First, we add a newStateLayout because of some improvement in state; in
> order to keep compatibility we may still keep the old state for the first
> version. We need to update the version so that we can generate a
> new-version plan for new jobs and keep the exec node compatible with
> old-version plans.
> After some versions, we may remove the old state layout and clean
> up the deprecated code. We still need to update the version, so that we
> can verify that we are compatible with plans after the first change, but
> not with earlier plans.
>
>
> Best,
> Wenlong
>
> On Mon, 6 Dec 2021 at 21:27, Timo Walther  wrote:
>
>> Hi Godfrey,
>>
>>  > design makes things more complex.
>>
>> Yes, the design might be a bit more complex. But operator migration is
>> way easier than ExecNode migration at a later point in time for code
>> maintenance. We know that ExecNodes can become pretty complex. Even
>> though we have put a lot of code into `CommonXXExecNode` it will be a
>> lot of work to maintain multiple versions of ExecNodes. If we can avoid
>> this with operator state migration, this should always be preferred over
>> a new ExecNode version.
>>
>> I'm aware that operator state migration might only be important for
>> roughly 10 % of all changes. A new ExecNode version will be used for 90%
>> of all changes.
>>
>>  > If there are multiple state layouts, which layout the ExecNode should
>> use?
>>
>> It is not the responsibility of the ExecNode to decide this but the
>> operator. Something like:
>>
>> class X extends ProcessFunction {
>>ValueState oldStateLayout;
>>ValueState newStateLayout;
>>
>>open() {
>>  if (oldStateLayout.get() != null) {
>>performOperatorMigration();
>>  }
>>  useNewStateLayout();
>>}
>> }
>>
>> Operator migration is meant for smaller "more local" changes without
>> touching the ExecNode layer. The CEP library and DataStream API sources
>> are performing operator migration for years already.
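The pattern sketched above can be fleshed out as a self-contained plain-Java analogue (hypothetical state layouts and class names, no Flink dependency — the real code would use Flink's ValueState inside open()):

```java
// Minimal stand-in for a keyed state slot such as Flink's ValueState.
class StateSlot<T> {
    private T value;
    T get() { return value; }
    void update(T v) { value = v; }
    void clear() { value = null; }
}

// Hypothetical operator supporting two state layouts: the old layout
// stored a comma-joined string, the new one stores a count. On open(),
// any old-layout state is migrated once, and only the new layout is
// read and written afterwards.
class MigratingOperator {
    final StateSlot<String> oldStateLayout = new StateSlot<>();
    final StateSlot<Long> newStateLayout = new StateSlot<>();

    void open() {
        if (oldStateLayout.get() != null) {
            performOperatorMigration();
        }
        // from here on, only newStateLayout is used
    }

    private void performOperatorMigration() {
        String old = oldStateLayout.get();
        newStateLayout.update((long) old.split(",").length);
        oldStateLayout.clear();
    }
}
```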
>>
>>
>>  > `supportedPlanChanges ` and `supportedSavepointChanges ` are a bit
>> obscure.
>>
>> Let me try to come up with more examples of why I think both annotations
>> make sense and are especially important *for test coverage*.
>>
>> supportedPlanChanges:
>>
>> Let's assume we have some JSON in Flink 1.15:
>>
>> {
>>some-prop: 42
>> }
>>
>> And we want to extend the JSON in Flink 1.16:
>>
>> {
>>some-prop: 42,
>>some-flag: false
>> }
>>
>> Maybe we don't need to increase the ExecNode version but only ensure
>> that the flag is set to `false` by default for the older versions.
>>
>> We need a location to track changes and document the changelog. With the
>> help of the annotation supportedPlanChanges = [1.15, 1.16] we can verify
>> that we have tests for both JSON formats.
>>
>> And once we decide to drop the 1.15 format, we enforce plan migration
>> and fill in the default value `false` in the old plans and bump their
>> JSON plan version to 1.16 or higher.
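The default-filling step described above could look roughly like this, with a Map standing in for the JSON plan (the key names come from the example above; the method name and version key are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plan migration: fill in the default for the flag introduced
// in 1.16 and bump the plan's version, so the 1.15 format can be dropped.
class PlanMigration {
    static Map<String, Object> migrateTo116(Map<String, Object> plan) {
        Map<String, Object> migrated = new HashMap<>(plan);
        migrated.putIfAbsent("some-flag", false); // default for 1.15 plans
        migrated.put("plan-version", "1.16");     // bump the JSON plan version
        return migrated;
    }
}
```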
>>
>>
>>
>>  > once the state layout is changed, the ExecNode version needs also be
>> updated
>>
>> This will still be the majority of cases. But if we can avoid this, we
>> should do it for not having too much duplicate code to maintain.
>>
>>
>>
>> Thanks,
>> Timo
>>
>>
>> On 06.12.21 09:58, godfrey he wrote:
>> > Hi, Timo,
>> >
>> > Thanks for the detailed explanation.
>> >
>> >> We change an operator state of B in Flink 1.16. We perform the change
>> in the operator of B in a way to support both state layouts. Thus, no need
>> for a new ExecNode version.
>> >
>> > I think this design makes things more complex.
>> > 1. If there are multiple state layouts, which layout the ExecNode
>> should use ?
>> > It increases the cost of understanding for developers (especially for
>> > Flink newcomers),
>> > making them prone to mistakes.
>> > 2. `supportedPlanChanges ` and `supportedSavepointChanges ` are a bit
>> obscure.
>> >
>> > The purpose of the ExecNode annotations is not only to support
>> > powerful validation, but more importantly to make it easy for
>> > developers to understand, and to ensure that every modification is
>> > straightforward and state compatible.
>> >
>> > I would prefer that, once the state layout is changed, the ExecNode
>> > version is also updated,
>> > which would keep things simple. How about
>> > rename `supportedPlanChanges ` to `planCompatibleVersion`
>> > (which means the plan is compatible with the plan generated by the
>> > given version node)
>> >   and rename 

Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-12-06 Thread wenlong.lwl
hi, Timo, I would prefer to update the version every time we change the
state layout too.

It could be possible that we change an exec node in 2 steps:
First, we add a newStateLayout because of some improvement in state; in
order to keep compatibility we may still keep the old state for the first
version. We need to update the version so that we can generate a
new-version plan for new jobs and keep the exec node compatible with
old-version plans.
After some versions, we may remove the old state layout and clean
up the deprecated code. We still need to update the version, so that we
can verify that we are compatible with plans after the first change, but
not with earlier plans.


Best,
Wenlong

On Mon, 6 Dec 2021 at 21:27, Timo Walther  wrote:

> Hi Godfrey,
>
>  > design makes things more complex.
>
> Yes, the design might be a bit more complex. But operator migration is
> way easier than ExecNode migration at a later point in time for code
> maintenance. We know that ExecNodes can become pretty complex. Even
> though we have put a lot of code into `CommonXXExecNode` it will be a
> lot of work to maintain multiple versions of ExecNodes. If we can avoid
> this with operator state migration, this should always be preferred over
> a new ExecNode version.
>
> I'm aware that operator state migration might only be important for
> roughly 10 % of all changes. A new ExecNode version will be used for 90%
> of all changes.
>
>  > If there are multiple state layouts, which layout the ExecNode should
> use?
>
> It is not the responsibility of the ExecNode to decide this but the
> operator. Something like:
>
> class X extends ProcessFunction {
>ValueState oldStateLayout;
>ValueState newStateLayout;
>
>open() {
>  if (oldStateLayout.get() != null) {
>performOperatorMigration();
>  }
>  useNewStateLayout();
>}
> }
>
> Operator migration is meant for smaller "more local" changes without
> touching the ExecNode layer. The CEP library and DataStream API sources
> are performing operator migration for years already.
>
>
>  > `supportedPlanChanges ` and `supportedSavepointChanges ` are a bit
> obscure.
>
> Let me try to come up with more examples of why I think both annotations
> make sense and are especially important *for test coverage*.
>
> supportedPlanChanges:
>
> Let's assume we have some JSON in Flink 1.15:
>
> {
>some-prop: 42
> }
>
> And we want to extend the JSON in Flink 1.16:
>
> {
>some-prop: 42,
>some-flag: false
> }
>
> Maybe we don't need to increase the ExecNode version but only ensure
> that the flag is set to `false` by default for the older versions.
>
> We need a location to track changes and document the changelog. With the
> help of the annotation supportedPlanChanges = [1.15, 1.16] we can verify
> that we have tests for both JSON formats.
>
> And once we decide to drop the 1.15 format, we enforce plan migration
> and fill in the default value `false` in the old plans and bump their
> JSON plan version to 1.16 or higher.
>
>
>
>  > once the state layout is changed, the ExecNode version needs also be
> updated
>
> This will still be the majority of cases. But if we can avoid this, we
> should do it for not having too much duplicate code to maintain.
>
>
>
> Thanks,
> Timo
>
>
> On 06.12.21 09:58, godfrey he wrote:
> > Hi, Timo,
> >
> > Thanks for the detailed explanation.
> >
> >> We change an operator state of B in Flink 1.16. We perform the change
> in the operator of B in a way to support both state layouts. Thus, no need
> for a new ExecNode version.
> >
> > I think this design makes things more complex.
> > 1. If there are multiple state layouts, which layout the ExecNode should
> use ?
> > It increases the cost of understanding for developers (especially for
> > Flink newcomers),
> > making them prone to mistakes.
> > 2. `supportedPlanChanges ` and `supportedSavepointChanges ` are a bit
> obscure.
> >
> > The purpose of the ExecNode annotations is not only to support
> > powerful validation, but more importantly to make it easy for
> > developers to understand, and to ensure that every modification is
> > straightforward and state compatible.
> >
> > I would prefer that, once the state layout is changed, the ExecNode
> > version is also updated,
> > which would keep things simple. How about
> > rename `supportedPlanChanges ` to `planCompatibleVersion`
> > (which means the plan is compatible with the plan generated by the
> > given version node)
> >   and rename `supportedSavepointChanges` to `savepointCompatibleVersion`
> > (which means the state is compatible with the state generated by the
> > given version node) ?
> > The names also indicate that only one version value can be set.
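As a sketch, the single-valued annotations Godfrey proposes might look like this (hypothetical shapes and an illustrative node — not the actual FLIP-190 API):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// The plan produced by this node is compatible with plans generated by
// the given (single) ExecNode version.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface PlanCompatibleVersion {
    int value();
}

// The state written by this node is compatible with state generated by
// the given (single) ExecNode version.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SavepointCompatibleVersion {
    int value();
}

// Illustrative ExecNode carrying both annotations; since each annotation
// takes a single value, only one compatible version can be declared.
@PlanCompatibleVersion(1)
@SavepointCompatibleVersion(1)
class StreamExecExampleNode {
}
```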
> >
> > WDYT?
> >
> > Best,
> > Godfrey
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Dec 2, 2021 at 11:42 PM Timo Walther  wrote:
> >>
> >> Response to Marios's feedback:
> >>
> >>   > there should be some good logging in place when the upgrade is
> taking
> >> place
> >>
> >> Yes, I 

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Yang Wang
FYI:

We(Alibaba) are widely using ZooKeeper 3.5.5 for all the YARN and some K8s
Flink high-available applications.


Best,
Yang

On Tue, Dec 7, 2021 at 2:22 AM Chesnay Schepler  wrote:

> Current users of ZK 3.4 and below would need to upgrade their Zookeeper
> installation that is used by Flink to 3.5+.
>
> Whether K8s users are affected depends on whether they use ZK or not. If
> they do, see above, otherwise they are not affected at all.
>
> On 06/12/2021 18:49, Arvid Heise wrote:
> > Could someone please help me understand the implications of the upgrade?
> >
> > As far as I understood this upgrade would only affect users that have
> > a zookeeper shared across multiple services, some of which require ZK
> > 3.4-? A workaround for those users would be to run two ZKs with
> > different versions, eventually deprecating old ZK, correct?
> >
> > If that is the only limitation, I'm +1 for the proposal since ZK 3.4
> > is already EOL.
> >
> > How are K8s users affected?
> >
> > Best,
> >
> > Arvid
> >
> > On Mon, Dec 6, 2021 at 2:00 PM Chesnay Schepler 
> > wrote:
> >
> > ping @users; any input on how this would affect you is highly
> > appreciated.
> >
> > On 25/11/2021 22:39, Chesnay Schepler wrote:
> > > I included the user ML in the thread.
> > >
> > > @users Are you still using Zookeeper 3.4? If so, were you
> > planning to
> > > upgrade Zookeeper in the near future?
> > >
> > > I'm not sure about ZK compatibility, but we'd also upgrade
> > Curator to
> > > 5.x, which doesn't support ZooKeeper 3.4 anymore.
> > >
> > > On 25/11/2021 21:56, Till Rohrmann wrote:
> > >> Should we ask on the user mailing list whether anybody is still
> > using
> > >> ZooKeeper 3.4 and thus needs support for this version or can a
> > ZooKeeper
> > >> 3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect
> > that
> > >> not a
> > >> lot of users depend on it but just to make sure that we aren't
> > >> annoying a
> > >> lot of our users with this change. Apart from that +1 for
> > removing it if
> > >> not a lot of user depend on it.
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl
> > 
> > >> wrote:
> > >>
> > >>> Thanks for starting this discussion, Chesnay. +1 from my side.
> > It's
> > >>> time to
> > >>> move forward with the ZK support considering the EOL of 3.4
> > you already
> > >>> mentioned. The benefits we gain from upgrading Curator to 5.x as
> a
> > >>> consequence is another plus point. Just for reference on the
> > >>> inconsistent
> > >>> state issue you mentioned: FLINK-24543 [1].
> > >>>
> > >>> Matthias
> > >>>
> > >>> [1] https://issues.apache.org/jira/browse/FLINK-24543
> > >>>
> > >>> On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler
> > 
> > >>> wrote:
> > >>>
> >  Hello,
> > 
> >  I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading
> the
> >  default to 3.5 with an opt-in for 3.6.
> > 
> >  Supporting Zookeeper 3.4 (which is already EOL) prevents us from
> >  upgrading Curator to 5.x, which would allow us to properly
> > fix an
> >  issue
> >  with inconsistent state. It is also required to eventually
> > support ZK
> > >>> 3.6.
> > >
> > >
> >
>


Re: [VOTE] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread Zhu Zhu
+1 (binding)

Thanks,
Zhu

On Tue, Dec 7, 2021 at 9:52 AM Yang Wang  wrote:

> +1 (non-binding)
>
> Best,
> Yang
>
> On Tue, Dec 7, 2021 at 9:38 AM Xintong Song  wrote:
>
> > +1 (binding)
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Dec 6, 2021 at 5:02 PM Till Rohrmann 
> wrote:
> >
> > > +1 (binding)
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Dec 6, 2021 at 10:00 AM David Morávek  wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'd like to open a vote on FLIP-194: Introduce the JobResultStore [1]
> > > which
> > > > has been
> > > > discussed in this thread [2].
> > > > The vote will be open for at least 72 hours unless there is an
> > objection
> > > or
> > > > not enough votes.
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> > > > [2] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94
> > > >
> > > > Best,
> > > > D.
> > > >
> > >
> >
>


Re: [VOTE] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread Yang Wang
+1 (non-binding)

Best,
Yang

On Tue, Dec 7, 2021 at 9:38 AM Xintong Song  wrote:

> +1 (binding)
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Dec 6, 2021 at 5:02 PM Till Rohrmann  wrote:
>
> > +1 (binding)
> >
> > Cheers,
> > Till
> >
> > On Mon, Dec 6, 2021 at 10:00 AM David Morávek  wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to open a vote on FLIP-194: Introduce the JobResultStore [1]
> > which
> > > has been
> > > discussed in this thread [2].
> > > The vote will be open for at least 72 hours unless there is an
> objection
> > or
> > > not enough votes.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> > > [2] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94
> > >
> > > Best,
> > > D.
> > >
> >
>


Re: [VOTE] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread Xintong Song
+1 (binding)

Thank you~

Xintong Song



On Mon, Dec 6, 2021 at 5:02 PM Till Rohrmann  wrote:

> +1 (binding)
>
> Cheers,
> Till
>
> On Mon, Dec 6, 2021 at 10:00 AM David Morávek  wrote:
>
> > Hi everyone,
> >
> > I'd like to open a vote on FLIP-194: Introduce the JobResultStore [1]
> which
> > has been
> > discussed in this thread [2].
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > not enough votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> > [2] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94
> >
> > Best,
> > D.
> >
>


Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Martijn Visser
+1 (non-binding)

On Mon, Dec 6, 2021 at 7:58 PM Ingo Bürk  wrote:

> Before more people let me know, let me update my vote to
>
> +1 (binding)
>
>
> (In all seriousness, thanks for the reminders!)
>
> On Mon, Dec 6, 2021, 16:54 Ingo Bürk  wrote:
>
> > +1 (non-binding)
> >
> >
> > Ingo
> >
> > On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler 
> > wrote:
> >
> >> Hello,
> >>
> >> after recent discussions on the dev
> >>  and
> >> user 
> >> mailing list to deprecate Java 8 support, with a general consensus in
> >> favor of it, I would now like to do a formal vote.
> >>
> >> The deprecation would entail a notification to our users to encourage
> >> migrating to Java 11, and various efforts on our side to prepare a
> >> migration to Java 11, like updating some e2e tests to actually run on
> >> Java 11, performance benchmarking etc.
> >>
> >> There is no set date for the removal of Java 8 support.
> >>
> >> We'll use the usual minimum 72h vote duration, with committers having
> >> binding votes.
> >>
> >>
>
-- 

Martijn Visser | Product Manager

mart...@ververica.com






Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Ingo Bürk
Before more people let me know, let me update my vote to

+1 (binding)


(In all seriousness, thanks for the reminders!)

On Mon, Dec 6, 2021, 16:54 Ingo Bürk  wrote:

> +1 (non-binding)
>
>
> Ingo
>
> On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler 
> wrote:
>
>> Hello,
>>
>> after recent discussions on the dev
>>  and
>> user 
>> mailing list to deprecate Java 8 support, with a general consensus in
>> favor of it, I would now like to do a formal vote.
>>
>> The deprecation would entail a notification to our users to encourage
>> migrating to Java 11, and various efforts on our side to prepare a
>> migration to Java 11, like updating some e2e tests to actually run on
>> Java 11, performance benchmarking etc.
>>
>> There is no set date for the removal of Java 8 support.
>>
>> We'll use the usual minimum 72h vote duration, with committers having
>> binding votes.
>>
>>


[jira] [Created] (FLINK-25203) Implement duplicating for aliyun

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25203:


 Summary: Implement duplicating for aliyun
 Key: FLINK-25203
 URL: https://issues.apache.org/jira/browse/FLINK-25203
 Project: Flink
  Issue Type: Sub-task
  Components: FileSystems
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We can use: https://www.alibabacloud.com/help/doc-detail/31979.htm





[jira] [Created] (FLINK-25202) Implement duplicating for azure

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25202:


 Summary: Implement duplicating for azure
 Key: FLINK-25202
 URL: https://issues.apache.org/jira/browse/FLINK-25202
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We can use: https://docs.microsoft.com/en-us/rest/api/storageservices/copy-blob





[jira] [Created] (FLINK-25201) Implement duplicating for gcs

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25201:


 Summary: Implement duplicating for gcs
 Key: FLINK-25201
 URL: https://issues.apache.org/jira/browse/FLINK-25201
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We can use https://cloud.google.com/storage/docs/json_api/v1/objects/rewrite





[jira] [Created] (FLINK-25200) Implement duplicating for s3 filesystem

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25200:


 Summary: Implement duplicating for s3 filesystem
 Key: FLINK-25200
 URL: https://issues.apache.org/jira/browse/FLINK-25200
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We can use https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html





Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler
Current users of ZK 3.4 and below would need to upgrade their Zookeeper 
installation that is used by Flink to 3.5+.


Whether K8s users are affected depends on whether they use ZK or not. If 
they do, see above, otherwise they are not affected at all.


On 06/12/2021 18:49, Arvid Heise wrote:

Could someone please help me understand the implications of the upgrade?

As far as I understood this upgrade would only affect users that have 
a zookeeper shared across multiple services, some of which require ZK 
3.4-? A workaround for those users would be to run two ZKs with 
different versions, eventually deprecating old ZK, correct?


If that is the only limitation, I'm +1 for the proposal since ZK 3.4 
is already EOL.


How are K8s users affected?

Best,

Arvid

On Mon, Dec 6, 2021 at 2:00 PM Chesnay Schepler  
wrote:


ping @users; any input on how this would affect you is highly
appreciated.

On 25/11/2021 22:39, Chesnay Schepler wrote:
> I included the user ML in the thread.
>
> @users Are you still using Zookeeper 3.4? If so, were you
planning to
> upgrade Zookeeper in the near future?
>
> I'm not sure about ZK compatibility, but we'd also upgrade
Curator to
> 5.x, which doesn't support ZooKeeper 3.4 anymore.
>
> On 25/11/2021 21:56, Till Rohrmann wrote:
>> Should we ask on the user mailing list whether anybody is still
using
>> ZooKeeper 3.4 and thus needs support for this version or can a
ZooKeeper
>> 3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect
that
>> not a
>> lot of users depend on it but just to make sure that we aren't
>> annoying a
>> lot of our users with this change. Apart from that +1 for
removing it if
>> not a lot of user depend on it.
>>
>> Cheers,
>> Till
>>
>> On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl

>> wrote:
>>
>>> Thanks for starting this discussion, Chesnay. +1 from my side.
It's
>>> time to
>>> move forward with the ZK support considering the EOL of 3.4
you already
>>> mentioned. The benefits we gain from upgrading Curator to 5.x as a
>>> consequence is another plus point. Just for reference on the
>>> inconsistent
>>> state issue you mentioned: FLINK-24543 [1].
>>>
>>> Matthias
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-24543
>>>
>>> On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler

>>> wrote:
>>>
 Hello,

 I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading the
 default to 3.5 with an opt-in for 3.6.

 Supporting Zookeeper 3.4 (which is already EOL) prevents us from
 upgrading Curator to 5.x, which would allow us to properly
fix an
 issue
 with inconsistent state. It is also required to eventually
support ZK
>>> 3.6.
>
>



Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Arvid Heise
Could someone please help me understand the implications of the upgrade?

As far as I understood this upgrade would only affect users that have a
zookeeper shared across multiple services, some of which require ZK 3.4-? A
workaround for those users would be to run two ZKs with different versions,
eventually deprecating old ZK, correct?

If that is the only limitation, I'm +1 for the proposal since ZK 3.4 is
already EOL.

How are K8s users affected?

Best,

Arvid

On Mon, Dec 6, 2021 at 2:00 PM Chesnay Schepler  wrote:

> ping @users; any input on how this would affect you is highly appreciated.
>
> On 25/11/2021 22:39, Chesnay Schepler wrote:
> > I included the user ML in the thread.
> >
> > @users Are you still using Zookeeper 3.4? If so, were you planning to
> > upgrade Zookeeper in the near future?
> >
> > I'm not sure about ZK compatibility, but we'd also upgrade Curator to
> > 5.x, which doesn't support ZooKeeper 3.4 anymore.
> >
> > On 25/11/2021 21:56, Till Rohrmann wrote:
> >> Should we ask on the user mailing list whether anybody is still using
> >> ZooKeeper 3.4 and thus needs support for this version or can a ZooKeeper
> >> 3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect that
> >> not a
> >> lot of users depend on it but just to make sure that we aren't
> >> annoying a
> >> lot of our users with this change. Apart from that +1 for removing it if
> >> not a lot of user depend on it.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl 
> >> wrote:
> >>
> >>> Thanks for starting this discussion, Chesnay. +1 from my side. It's
> >>> time to
> >>> move forward with the ZK support considering the EOL of 3.4 you already
> >>> mentioned. The benefits we gain from upgrading Curator to 5.x as a
> >>> consequence is another plus point. Just for reference on the
> >>> inconsistent
> >>> state issue you mentioned: FLINK-24543 [1].
> >>>
> >>> Matthias
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-24543
> >>>
> >>> On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler 
> >>> wrote:
> >>>
>  Hello,
> 
>  I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading the
>  default to 3.5 with an opt-in for 3.6.
> 
>  Supporting Zookeeper 3.4 (which is already EOL) prevents us from
>  upgrading Curator to 5.x, which would allow us to properly fix an
>  issue
>  with inconsistent state. It is also required to eventually support ZK
> >>> 3.6.
> >
> >
>
>


Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Arvid Heise
+1 (binding)

On Mon, Dec 6, 2021 at 5:43 PM Timo Walther  wrote:

> +1 (binding)
>
> Thanks,
> Timo
>
> On 06.12.21 17:28, David Morávek wrote:
> > +1 (non-binding)
> >
> > On Mon, Dec 6, 2021 at 4:55 PM Ingo Bürk  wrote:
> >
> >> +1 (non-binding)
> >>
> >>
> >> Ingo
> >>
> >> On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler 
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> after recent discussions on the dev
> >>>  and
> >>> user  >
> >>> mailing list to deprecate Java 8 support, with a general consensus in
> >>> favor of it, I would now like to do a formal vote.
> >>>
> >>> The deprecation would entail a notification to our users to encourage
> >>> migrating to Java 11, and various efforts on our side to prepare a
> >>> migration to Java 11, like updating some e2e tests to actually run on
> >>> Java 11, performance benchmarking etc.
> >>>
> >>> There is no set date for the removal of Java 8 support.
> >>>
> >>> We'll use the usual minimum 72h vote duration, with committers having
> >>> binding votes.
> >>>
> >>>
> >>
> >
>
>


[jira] [Created] (FLINK-25199) fromValues does not emit final MAX watermark

2021-12-06 Thread Timo Walther (Jira)
Timo Walther created FLINK-25199:


 Summary: fromValues does not emit final MAX watermark
 Key: FLINK-25199
 URL: https://issues.apache.org/jira/browse/FLINK-25199
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Timo Walther


It seems that {{fromValues}} generating multiple rows does not emit any 
watermarks:

{code}
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

Table inputTable =
tEnv.fromValues(
DataTypes.ROW(
DataTypes.FIELD("weight", DataTypes.DOUBLE()),
DataTypes.FIELD("f0", DataTypes.STRING()),
DataTypes.FIELD("f1", DataTypes.DOUBLE()),
DataTypes.FIELD("f2", DataTypes.DOUBLE()),
DataTypes.FIELD("f3", DataTypes.DOUBLE()),
DataTypes.FIELD("f4", DataTypes.INT()),
DataTypes.FIELD("label", DataTypes.STRING())),
Row.of(1., "a", 1., 1., 1., 2, "l1"),
Row.of(1., "a", 1., 1., 1., 2, "l1"));

DataStream<Row> input = tEnv.toDataStream(inputTable);
{code}

{{fromValues(1, 2, 3)}} or {{fromValues}} with only 1 row works correctly.





Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Timo Walther

+1 (binding)

Thanks,
Timo

On 06.12.21 17:28, David Morávek wrote:

+1 (non-binding)

On Mon, Dec 6, 2021 at 4:55 PM Ingo Bürk  wrote:


+1 (non-binding)


Ingo

On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler 
wrote:


Hello,

after recent discussions on the dev
 and
user 
mailing list to deprecate Java 8 support, with a general consensus in
favor of it, I would now like to do a formal vote.

The deprecation would entail a notification to our users to encourage
migrating to Java 11, and various efforts on our side to prepare a
migration to Java 11, like updating some e2e tests to actually run on
Java 11, performance benchmarking etc.

There is no set date for the removal of Java 8 support.

We'll use the usual minimum 72h vote duration, with committers having
binding votes.










[jira] [Created] (FLINK-25198) add document about how to debug with the name and description

2021-12-06 Thread Wenlong Lyu (Jira)
Wenlong Lyu created FLINK-25198:
---

 Summary: add document about how to debug with the name and 
description 
 Key: FLINK-25198
 URL: https://issues.apache.org/jira/browse/FLINK-25198
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wenlong Lyu
 Fix For: 1.15.0


The doc could go in the debugging section:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/debugging/debugging_event_time/





Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread David Morávek
+1 (non-binding)

On Mon, Dec 6, 2021 at 4:55 PM Ingo Bürk  wrote:

> +1 (non-binding)
>
>
> Ingo
>
> On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler 
> wrote:
>
> > Hello,
> >
> > after recent discussions on the dev
> >  and
> > user 
> > mailing list to deprecate Java 8 support, with a general consensus in
> > favor of it, I would now like to do a formal vote.
> >
> > The deprecation would entail a notification to our users to encourage
> > migrating to Java 11, and various efforts on our side to prepare a
> > migration to Java 11, like updating some e2e tests to actually run on
> > Java 11, performance benchmarking etc.
> >
> > There is no set date for the removal of Java 8 support.
> >
> > We'll use the usual minimum 72h vote duration, with committers having
> > binding votes.
> >
> >
>


Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Ingo Bürk
+1 (non-binding)


Ingo

On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler  wrote:

> Hello,
>
> after recent discussions on the dev
>  and
> user 
> mailing list to deprecate Java 8 support, with a general consensus in
> favor of it, I would now like to do a formal vote.
>
> The deprecation would entail a notification to our users to encourage
> migrating to Java 11, and various efforts on our side to prepare a
> migration to Java 11, like updating some e2e tests to actually run on
> Java 11, performance benchmarking, etc.
>
> There is no set date for the removal of Java 8 support.
>
> We'll use the usual minimum 72h vote duration, with committers having
> binding votes.
>
>


Re: [VOTE] Deprecate Java 8 support

2021-12-06 Thread Konstantin Knauf
+1

On Mon, Dec 6, 2021 at 4:44 PM Chesnay Schepler  wrote:

> Hello,
>
> after recent discussions on the dev
>  and
> user 
> mailing list to deprecate Java 8 support, with a general consensus in
> favor of it, I would now like tod o a formal vote.
>
> The deprecation would entail a notification to our users to encourage
> migrating to Java 11, and various efforts on our side to prepare a
> migration to Java 11, like updating some e2e tests to actually run on
> Java 11, performance benchmarking etc. .
>
> There is no set date for the removal of Java 8 support.
>
> We'll use the usual minimum 72h vote duration, with committers having
> binding votes.
>
>

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


[VOTE] Deprecate Java 8 support

2021-12-06 Thread Chesnay Schepler

Hello,

after recent discussions on the dev 
 and 
user  
mailing list to deprecate Java 8 support, with a general consensus in 
favor of it, I would now like to do a formal vote.


The deprecation would entail a notification to our users to encourage 
migrating to Java 11, and various efforts on our side to prepare a 
migration to Java 11, like updating some e2e tests to actually run on 
Java 11, performance benchmarking, etc.


There is no set date for the removal of Java 8 support.

We'll use the usual minimum 72h vote duration, with committers having 
binding votes.




[jira] [Created] (FLINK-25197) Using Statefun RequestReplyFunctionBuilder fails with Java 8 date/time type `java.time.Duration` not supported by default: add Module "org.apache.flink.shaded.jackson2.c

2021-12-06 Thread Galen Warren (Jira)
Galen Warren created FLINK-25197:


 Summary: Using Statefun RequestReplyFunctionBuilder fails with 
Java 8 date/time type `java.time.Duration` not supported by default: add Module 
"org.apache.flink.shaded.jackson2.com.fasterxml.jackson.datatype:jackson-datatype-jsr310"
 to enable handling 
 Key: FLINK-25197
 URL: https://issues.apache.org/jira/browse/FLINK-25197
 Project: Flink
  Issue Type: Bug
  Components: Stateful Functions
Affects Versions: statefun-3.1.0
Reporter: Galen Warren
 Fix For: statefun-3.1.0


When using RequestReplyFunctionBuilder to build a stateful functions job, the 
job fails at runtime with:

Java 8 date/time type `java.time.Duration` not supported by default: add Module 
"org.apache.flink.shaded.jackson2.com.fasterxml.jackson.datatype:jackson-datatype-jsr310"
 to enable handling 

It appears this is because, in 
[RequestReplyFunctionBuilder::transportClientPropertiesAsObjectNode|https://github.com/apache/flink-statefun/blob/b4ba9547b8f0105a28544fd28a5e0433666e9023/statefun-flink/statefun-flink-datastream/src/main/java/org/apache/flink/statefun/flink/datastream/RequestReplyFunctionBuilder.java#L127],
 a default instance of ObjectMapper is used to serialize the client properties, 
which now include a java.time.Duration. There is a 
[StateFunObjectMapper|https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-common/src/main/java/org/apache/flink/statefun/flink/common/json/StateFunObjectMapper.java]
 class in the project that has customized serde support, but it is not used 
here.

The fix seems to be to:
 * Use an instance of StateFunObjectMapper to serialize the client properties 
in RequestReplyFunctionBuilder
 * Modify StateFunObjectMapper to both serialize and deserialize instances of 
java.time.Duration (currently, only deserialization is supported)

I've made these changes locally and it seems to fix the problem. Would you be 
interested in a PR? Thanks.
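For illustration, the failure boils down to `java.time.Duration` having no default JSON handling; the JSR-310 module (or a custom serde, as StateFunObjectMapper would need) round-trips it through its ISO-8601 text form. Below is a minimal, JDK-only sketch of that round trip. The class and method names are hypothetical, not StateFun API, and no Jackson code is involved:

```java
import java.time.Duration;

// Hypothetical sketch (plain JDK, no Jackson): round-trips a Duration
// through its ISO-8601 text form, which is essentially what a
// jackson-datatype-jsr310 style module provides once registered.
public class DurationSerdeSketch {

    static String serialize(Duration d) {
        return d.toString(); // ISO-8601, e.g. "PT30S" for 30 seconds
    }

    static Duration deserialize(String s) {
        return Duration.parse(s);
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(30);
        String text = serialize(timeout);
        System.out.println(text);                           // PT30S
        System.out.println(deserialize(text).getSeconds()); // 30
    }
}
```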

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25196) Add Documentation for Data Sink

2021-12-06 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25196:
---

 Summary: Add Documentation for Data Sink
 Key: FLINK-25196
 URL: https://issues.apache.org/jira/browse/FLINK-25196
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Documentation
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25195) Use duplicating API for shared artefacts in RocksDB snapshots

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25195:


 Summary: Use duplicating API for shared artefacts in RocksDB 
snapshots
 Key: FLINK-25195
 URL: https://issues.apache.org/jira/browse/FLINK-25195
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing, Runtime / State Backends
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We could use the duplicating API to cheaply create an independent copy of 
shared artefacts instead of uploading them all again (as described in 
FLINK-25192).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25194) Implement an API for duplicating artefacts

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25194:


 Summary: Implement an API for duplicating artefacts
 Key: FLINK-25194
 URL: https://issues.apache.org/jira/browse/FLINK-25194
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem, Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We should implement methods that let us duplicate artefacts in a DFS. We can 
later on use it for cheaply duplicating shared snapshots artefacts instead of 
reuploading them.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-12-06 Thread Timo Walther

Hi Godfrey,

> design makes things more complex.

Yes, the design might be a bit more complex. But in terms of long-term 
code maintenance, operator migration is way easier than ExecNode 
migration. We know that ExecNodes can become pretty complex. Even 
though we have put a lot of code into `CommonXXExecNode` it will be a 
lot of work to maintain multiple versions of ExecNodes. If we can avoid 
this with operator state migration, this should always be preferred over 
a new ExecNode version.


I'm aware that operator state migration might only be important for 
roughly 10 % of all changes. A new ExecNode version will be used for 90% 
of all changes.


> If there are multiple state layouts, which layout should the ExecNode use?


It is not the responsibility of the ExecNode to decide this but the 
operator. Something like:


class X extends ProcessFunction {
  ValueState oldStateLayout;
  ValueState newStateLayout;

  open() {
if (oldStateLayout.get() != null) {
  performOperatorMigration();
}
useNewStateLayout();
  }
}

Operator migration is meant for smaller, more "local" changes that don't 
touch the ExecNode layer. The CEP library and DataStream API sources 
have been performing operator migration for years already.
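To make that pattern concrete, here is a plain-Java simulation of the sketch above (no Flink APIs; the maps stand in for `ValueState` handles, and the layouts are invented for illustration): on `open()`, a value found in the old layout is migrated once into the new layout, after which only the new layout is used.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only (not Flink API): simulates operator state
// migration. The "old" layout stores a count as a String, the "new"
// layout as a long; open() migrates any old-layout entry exactly once.
public class StateMigrationSketch {

    final Map<String, String> oldStateLayout = new HashMap<>(); // legacy layout
    final Map<String, Long> newStateLayout = new HashMap<>();   // current layout

    void open(String key) {
        String old = oldStateLayout.get(key);
        if (old != null) {
            // performOperatorMigration(): convert and drop the old entry
            newStateLayout.put(key, Long.parseLong(old));
            oldStateLayout.remove(key);
        }
    }

    long value(String key) {
        return newStateLayout.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        StateMigrationSketch op = new StateMigrationSketch();
        op.oldStateLayout.put("k", "42"); // as if restored from an old savepoint
        op.open("k");
        System.out.println(op.value("k")); // 42
    }
}
```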



> `supportedPlanChanges` and `supportedSavepointChanges` are a bit obscure.


Let me try to come up with more examples of why I think both annotations 
make sense and are especially important *for test coverage*.


supportedPlanChanges:

Let's assume we have some JSON in Flink 1.15:

{
  some-prop: 42
}

And we want to extend the JSON in Flink 1.16:

{
  some-prop: 42,
  some-flag: false
}

Maybe we don't need to increase the ExecNode version but only ensure 
that the flag is set to `false` by default for the older versions.


We need a location to track changes and document the changelog. With the 
help of the annotation supportedPlanChanges = [1.15, 1.16] we can verify 
that we have tests for both JSON formats.


And once we decide to drop the 1.15 format, we enforce plan migration 
and fill-in the default value `false` into the old plans and bump their 
JSON plan version to 1.16 or higher.
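As a sketch of that fill-in step (JDK-only, with a `Map` standing in for the real JSON plan model; the `some-prop`/`some-flag` keys mirror the example above, and the `flink-version` key is a hypothetical name for the plan-version field):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: when loading an old-format plan, fill in the
// default for the newly added flag and bump the plan format version.
public class PlanMigrationSketch {

    static Map<String, Object> migrate(Map<String, Object> plan) {
        Map<String, Object> migrated = new HashMap<>(plan);
        migrated.putIfAbsent("some-flag", Boolean.FALSE); // default for 1.15 plans
        migrated.put("flink-version", "1.16");            // bump the plan version
        return migrated;
    }

    public static void main(String[] args) {
        Map<String, Object> oldPlan = new HashMap<>();
        oldPlan.put("some-prop", 42); // 1.15-format plan without the flag
        System.out.println(migrate(oldPlan).get("some-flag")); // false
    }
}
```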




> once the state layout is changed, the ExecNode version also needs to be
updated


This will still be the majority of cases. But where we can avoid a new 
version, we should, so that we don't have too much duplicate code to maintain.




Thanks,
Timo


On 06.12.21 09:58, godfrey he wrote:

Hi, Timo,

Thanks for the detailed explanation.


We change an operator state of B in Flink 1.16. We perform the change in the 
operator of B in a way to support both state layouts. Thus, no need for a new 
ExecNode version.


I think this design makes things more complex.
1. If there are multiple state layouts, which layout should the ExecNode use?
It increases the cost of understanding for developers (especially for
newcomers to Flink), making them prone to mistakes.
2. `supportedPlanChanges` and `supportedSavepointChanges` are a bit obscure.

The purpose of the ExecNode annotations is not only to support powerful
validation, but more importantly to make the code easy for developers to
understand and to ensure that every modification is easy and state-compatible.

I prefer that, once the state layout is changed, the ExecNode version
also needs to be updated, which would keep things simple. How about
renaming `supportedPlanChanges` to `planCompatibleVersion`
(which means the plan is compatible with the plan generated by the
node of the given version) and renaming `supportedSavepointChanges`
to `savepointCompatibleVersion` (which means the state is compatible
with the state generated by the node of the given version)?
The names also indicate that only one version value can be set.

WDYT?

Best,
Godfrey









Timo Walther  于2021年12月2日周四 下午11:42写道:


Response to Marios's feedback:

  > there should be some good logging in place when the upgrade is taking
place

Yes, I agree. I added this part to the FLIP.

  > config option instead that doesn't provide the flexibility to
overwrite certain plans

One can set the config option also around sections of the
multi-statement SQL script.

SET 'table.plan.force-recompile'='true';

COMPILE ...

SET 'table.plan.force-recompile'='false';

But the question is why a user wants to run COMPILE multiple times. If
it is during development, then running EXECUTE (or just the statement
itself) without calling COMPILE should be sufficient. The file can also
manually be deleted if necessary.

What do you think?

Regards,
Timo



On 02.12.21 16:09, Timo Walther wrote:

Hi Till,

Yes, you might have to. But not a new plan from the SQL query but a
migration from the old plan to the new plan. This will not happen often.
But we need a way to evolve the format of the JSON plan itself.

Maybe this confuses a bit, so let me clarify it again: Mostly ExecNode
versions and operator state layouts will evolve. Not the plan files,
those will be pretty stable. But also not infinitely.

Regards,
Timo


On 02.12.21 16:01, Till Rohrmann wrote:


[jira] [Created] (FLINK-25193) Document claim & no-claim mode

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25193:


 Summary: Document claim & no-claim mode
 Key: FLINK-25193
 URL: https://issues.apache.org/jira/browse/FLINK-25193
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


We should describe how the different restore modes work. It is important to go 
through the FLIP and include all {{NOTES}} in the written documentation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25192) Implement proper no-claim mode support

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25192:


 Summary: Implement proper no-claim mode support
 Key: FLINK-25192
 URL: https://issues.apache.org/jira/browse/FLINK-25192
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


In the no-claim mode, Flink should not depend on any artefacts of the initial 
snapshot after the restore. In order to do that, we should pass a flag along 
with the RPC and later on with a CheckpointBarrier to notify TaskManagers of 
that intention. Moreover, state backends should take the flag into 
consideration and take "full snapshots":
* RocksDB state backend should upload all files instead of reusing artefacts 
from the initial one
* Changelog state backend should materialize the changelog upon the flag

https://cwiki.apache.org/confluence/x/bIyqCw#FLIP193:Snapshotsownership-No-claimmode(defaultmode)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler

ping @users; any input on how this would affect you is highly appreciated.

On 25/11/2021 22:39, Chesnay Schepler wrote:

I included the user ML in the thread.

@users Are you still using Zookeeper 3.4? If so, were you planning to 
upgrade Zookeeper in the near future?


I'm not sure about ZK compatibility, but we'd also upgrade Curator to 
5.x, which doesn't support ZooKeeper 3.4 anymore.


On 25/11/2021 21:56, Till Rohrmann wrote:

Should we ask on the user mailing list whether anybody is still using
ZooKeeper 3.4 and thus needs support for this version or can a ZooKeeper
3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect that 
not a
lot of users depend on it but just to make sure that we aren't 
annoying a

lot of our users with this change. Apart from that +1 for removing it if
not a lot of users depend on it.

Cheers,
Till

On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl 
wrote:

Thanks for starting this discussion, Chesnay. +1 from my side. It's 
time to

move forward with the ZK support considering the EOL of 3.4 you already
mentioned. The benefits we gain from upgrading Curator to 5.x as a
consequence is another plus point. Just for reference on the 
inconsistent

state issue you mentioned: FLINK-24543 [1].

Matthias

[1] https://issues.apache.org/jira/browse/FLINK-24543

On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler 
wrote:


Hello,

I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading the
default to 3.5 with an opt-in for 3.6.

Supporting Zookeeper 3.4 (which is already EOL) prevents us from
upgrading Curator to 5.x, which would allow us to properly fix an 
issue

with inconsistent state. It is also required to eventually support ZK

3.6.







Re: [DISCUSS] Deprecate Java 8 support

2021-12-06 Thread Piotr Nowojski
> @Piotr I guess that could be investigated along with an update of
flink-benchmarks to run on Java 11 as well...

I would second Nico's idea to run our benchmarks on Java 11 to see what has
changed compared to Java 8.

Best,
Piotrek

śr., 1 gru 2021 o 11:23 wenlong.lwl  napisał(a):

> Thanks for the explanation.
>
> +1 for putting a bigger emphasis on performance on Java 11+. It would be
> great if Flink is already well prepared when users want to do the upgrade.
>
>
> Best,
> Wenlong
>
> On Wed, 1 Dec 2021 at 16:54, Chesnay Schepler  wrote:
>
> > That is correct, yes. During the deprecation period we won't seen any
> > improvements on our end.
> >
> > Something that I could envision is that the docker images will default to
> > Java 11 after a while, and that we put a bigger emphasis on performance
> on
> > Java 11+ than on Java 8.
> >
> > On 01/12/2021 02:31, wenlong.lwl wrote:
> >
> > hi, @Chesnay Schepler  would you explain more about
> > what would happen when deprecating Java 8, but still support it. IMO, if
> we
> > still generate packages based on Java 8 which seems to be a  consensus,
> > we still can not take the advantages you mentioned even if we announce
> that
> > Java 8 support is deprecated.
> >
> >
> > Best,
> > Wenlong
> >
> > On Mon, 29 Nov 2021 at 17:22, Marios Trivyzas  wrote:
> >
> >> +1 from me as well on the Java 8 deprecation!
> >> It's important to make the users aware, and "force" them but also the
> >> communities of other
> >> related projects (like the aforementioned Hive) to start preparing for
> the
> >> future drop of Java 8
> >> support and the upgrade to the recent stable versions.
> >>
> >>
> >> On Sun, Nov 28, 2021 at 11:15 PM Thomas Weise  wrote:
> >>
> >> > +1 for Java 8 deprecation. It's an important signal for users and we
> >> > need to give sufficient time to adopt. Thanks Chesnay for starting the
> >> > discussion! Maybe user@ can be included into this discussion?
> >> >
> >> > Thomas
> >> >
> >> >
> >> > On Fri, Nov 26, 2021 at 6:49 AM Becket Qin 
> >> wrote:
> >> > >
> >> > > Thanks for raising the discussion, Chesnay.
> >> > >
> >> > > I think it is OK to deprecate Java 8 to let the users know that Java
> >> 11
> >> > > migration should be put into the schedule. However, According to
> some
> >> of
> >> > > the statistics of the Java version adoption[1], a large number of
> >> users
> >> > are
> >> > > still using Java 8 in production. I doubt that Java 8 users will
> drop
> >> to
> >> > a
> >> > > negligible amount within the next 2 - 3 Flink releases. I would
> >> suggest
> >> > > making the time to drop Java 8 support flexible.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jiangjie (Becket) Qin
> >> > >
> >> > > [1] https://www.infoq.com/news/2021/07/snyk-jvm-2021/
> >> > >
> >> > > On Fri, Nov 26, 2021 at 5:09 AM Till Rohrmann  >
> >> > wrote:
> >> > >
> >> > > > +1 for the deprecation and reaching out to the user ML to ask for
> >> > feedback
> >> > > > from our users. Thanks for driving this Chesnay!
> >> > > >
> >> > > > Cheers,
> >> > > > Till
> >> > > >
> >> > > > On Thu, Nov 25, 2021 at 10:15 AM Roman Khachatryan <
> >> ro...@apache.org>
> >> > > > wrote:
> >> > > >
> >> > > > > The situation is probably a bit different now compared to the
> >> > previous
> >> > > > > upgrade: some users might be using Amazon Coretto (or other
> >> builds)
> >> > > > > which have longer support.
> >> > > > >
> >> > > > > Still +1 for deprecation to trigger migration, and thanks for
> >> > bringing
> >> > > > > this up!
> >> > > > >
> >> > > > > Regards,
> >> > > > > Roman
> >> > > > >
> >> > > > > On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise 
> >> > wrote:
> >> > > > > >
> >> > > > > > +1 to deprecate Java 8, so we can hopefully incorporate the
> >> module
> >> > > > > concept
> >> > > > > > in Flink.
> >> > > > > >
> >> > > > > > On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler <
> >> > ches...@apache.org>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > Users can already use APIs from Java 8/11.
> >> > > > > > >
> >> > > > > > > On 25/11/2021 09:35, Francesco Guardiani wrote:
> >> > > > > > > > +1 with what both Ingo and Matthias sad, personally, I
> >> cannot
> >> > wait
> >> > > > to
> >> > > > > > > start using some of
> >> > > > > > > > the APIs introduced in Java 9. And I'm pretty sure that's
> >> the
> >> > same
> >> > > > > for
> >> > > > > > > our users as well.
> >> > > > > > > >
> >> > > > > > > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> >> > > > > > > >> Hi everyone,
> >> > > > > > > >>
> >> > > > > > > >> continued support for Java 8 can also create project
> risks,
> >> > e.g.
> >> > > > if
> >> > > > > a
> >> > > > > > > >> vulnerability arises in Flink's dependencies and we
> cannot
> >> > upgrade
> >> > > > > them
> >> > > > > > > >> because they no longer support Java 8. Some projects
> >> already
> >> > > > started
> >> > > > > > > >> deprecating support as well, like Kafka, and other
> projects
> >> > will
> >> > > > > 

[jira] [Created] (FLINK-25191) Skip savepoints for recovery

2021-12-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-25191:


 Summary: Skip savepoints for recovery
 Key: FLINK-25191
 URL: https://issues.apache.org/jira/browse/FLINK-25191
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.15.0


Intermediate savepoints should not be used for recovery. In order to achieve 
that we should:
* do not send {{notifyCheckpointComplete}} for intermediate savepoints
* do not add them to {{CompletedCheckpointStore}}

Important! Synchronous savepoints (stop-with-savepoint) should still commit 
side-effects. We need to distinguish them from the intermediate savepoints.

https://cwiki.apache.org/confluence/x/bIyqCw#FLIP193:Snapshotsownership-SkippingSavepointsforRecovery



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] FLIP-193: Snapshots ownership

2021-12-06 Thread Dawid Wysakowicz
Dear devs,

Thank you all for the participation in all of the comments in the
discussion!

I'm happy to announce that we have unanimously approved this FLIP.

There are 7 approving votes of which 7 are binding:

  * Konstantin Knauf (binding)
  * Till Rohrmann (binding)
  * Yun Gao (binding)
  * Roman Khachatryan (binding)
  * Yun Tang (binding)
  * Yu Li (binding)
  * Dawid Wysakowicz (binding)

Best,

Dawid

On 01/12/2021 14:03, Konstantin Knauf wrote:
> Thanks, Dawid.
>
> +1
>
> On Wed, Dec 1, 2021 at 1:23 PM Dawid Wysakowicz 
> wrote:
>
>> Dear devs,
>>
>> I'd like to open a vote on FLIP-193: Snapshots ownership [1] which was
>> discussed in this thread [2].
>> The vote will be open for at least 72 hours unless there is an objection or
>> not enough votes.
>>
>> Best,
>>
>> Dawid
>>
>> [1] https://cwiki.apache.org/confluence/x/bIyqCw
>>
>> [2] https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2
>>
>>
>>


OpenPGP_signature
Description: OpenPGP digital signature


[DISCUSS] Strong read-after-write consistency of Flink FileSystems

2021-12-06 Thread David Morávek
Hi Everyone,

as outlined in FLIP-194 discussion [1], for the future directions of Flink
HA services, I'd like to verify my thoughts around guarantees of the
distributed filesystems used with Flink.

Currently some of the services (*JobGraphStore*, *CompletedCheckpointStore*)
are implemented using a combination of strongly consistent Metadata storage
(ZooKeeper, K8s CM) and the actual FileSystem. Reasoning behind this dates
back to days, when S3 was an eventually consistent FileSystem and we needed
a strongly consistent view of the data.

I did some research, and my feeling is that all the major FileSystems that
Flink supports already provide strong read-after-write consistency, which
would be sufficient to decrease the complexity of the current HA
implementations.

FileSystems that I've checked and that seem to support strong
read-after-write consistency:
- S3
- GCS
- Azure Blob Storage
- Aliyun OSS
- HDFS
- Minio

Are you aware of other FileSystems that are used with Flink? Do they
support the consistency that is required for starting new initiatives
towards simpler / less error-prone HA services? Are you aware of any
problems with the above mentioned FileSystems that I might have missed?

I'm also bringing this up to user@f.a.o, to make sure we don't miss any
FileSystems.

[1] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94

Best,
D.
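To pin down the property in question: strong read-after-write consistency means that a read issued after a successful write must observe that write. Below is a minimal, JDK-only sketch of the check, run against the local filesystem purely for illustration; against an object store, the write and read would go through that store's client instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: write an object, then immediately read it back.
// On a strongly read-after-write consistent store, the read must return
// exactly what was written.
public class ReadAfterWriteCheck {

    static boolean check() {
        try {
            Path p = Files.createTempFile("raw-check", ".bin");
            byte[] payload = "job-graph-v1".getBytes(StandardCharsets.UTF_8);
            Files.write(p, payload);                 // write completes...
            byte[] readBack = Files.readAllBytes(p); // ...the read must see it
            Files.delete(p);
            return new String(readBack, StandardCharsets.UTF_8).equals("job-graph-v1");
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(check() ? "read-after-write: OK" : "NOT consistent");
    }
}
```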


[jira] [Created] (FLINK-25190) The number of TaskManagers whose status is Pending should be reported

2021-12-06 Thread john (Jira)
john created FLINK-25190:


 Summary: The number of TaskManagers whose status is Pending should 
be reported
 Key: FLINK-25190
 URL: https://issues.apache.org/jira/browse/FLINK-25190
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes, Deployment / YARN
Reporter: john


The number of TaskManagers in Pending status should be reported, so that 
external systems can detect the lack of resources.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread David Morávek
>
> I also hope that we could remove all the pointers in the HA store(ZK,
> ConfigMap) in the future.


I'll open a new thread with {user,dev}@f.a.o to verify the thoughts around
strong read-after-write consistency for FileSystems. If that goes well I can see
it as one of the possible topics for 1.16 ;)

On Mon, Dec 6, 2021 at 11:03 AM Yang Wang  wrote:

> Thanks for the fruitful discussion. I also hope that we could remove all
> the pointers in the HA store(ZK, ConfigMap) in the future.
> After then, we only rely on the ZK/ConfigMap for leader election/retrieval.
>
>
> Best,
> Yang
>
> David Morávek  于2021年12月6日周一 下午4:57写道:
>
> > as all of the concerns seems to be addressed, I'd like to proceed with
> the
> > vote to move things forward.
> >
> > Thanks everyone for the feedback, it was really helpful!
> >
> > Best,
> > D.
> >
> > On Wed, Dec 1, 2021 at 6:39 AM Zhu Zhu  wrote:
> >
> > > Thanks for the explanation Matthias. The solution sounds good to me.
> > > I have no more concerns and +1 for the FLIP.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Xintong Song  于2021年12月1日周三 下午12:56写道:
> > >
> > > > @David,
> > > >
> > > > Thanks for the clarification.
> > > >
> > > > No more concerns from my side. +1 for this FLIP.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Wed, Dec 1, 2021 at 12:28 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > Given the other breaking changes, I think that it is ok to remove
> the
> > > > > `RunningJobsRegistry` completely.
> > > > >
> > > > > Since we allow users to specify a HighAvailabilityServices
> > > implementation
> > > > > when starting Flink via `high-availability: FQDN`, I think we
> should
> > > mark
> > > > > the interface at least @Experimental.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Nov 30, 2021 at 2:29 PM Mika Naylor 
> > wrote:
> > > > >
> > > > > > Hi Till,
> > > > > >
> > > > > > We thought that breaking interfaces, specifically
> > > > > > HighAvailabilityServices and RunningJobsRegistry, was acceptable
> in
> > > > this
> > > > > > instance because:
> > > > > >
> > > > > > - Neither of these interfaces are marked @Public and so carry no
> > > > > >guarantees about being public and stable.
> > > > > > - As far as we are aware, we currently have no users with custom
> > > > > >HighAvailabilityServices implementations.
> > > > > > - The interface was already broken in 1.14 with the changes to
> > > > > >CheckpointRecoveryFactory, and will likely be changed again in
> > > 1.15
> > > > > >due to further changes in that factory.
> > > > > >
> > > > > > Given that, we thought changes to the interface would not be
> > > > disruptive.
> > > > > > Perhaps it could be annotated as @Internal - I'm not sure exactly
> > > what
> > > > > > guarantees we try and give for the stability of the
> > > > > > HighAvailabilityServices interface.
> > > > > >
> > > > > > Kind regards,
> > > > > > Mika
> > > > > >
> > > > > > On 26.11.2021 18:28, Till Rohrmann wrote:
> > > > > > >Thanks for creating this FLIP Matthias, Mika and David.
> > > > > > >
> > > > > > >I think the JobResultStore is an important piece for fixing
> > Flink's
> > > > last
> > > > > > >high-availability problem (afaik). Once we have this piece in
> > place,
> > > > > users
> > > > > > >no longer risk to re-execute a successfully completed job.
> > > > > > >
> > > > > > >I have one comment concerning breaking interfaces:
> > > > > > >
> > > > > > >If we don't want to break interfaces, then we could keep the
> > > > > > >HighAvailabilityServices.getRunningJobsRegistry() method and
> add a
> > > > > default
> > > > > > >implementation for HighAvailabilityServices.getJobResultStore().
> > We
> > > > > could
> > > > > > >then deprecate the former method and then remove it in the
> > > subsequent
> > > > > > >release (1.16).
> > > > > > >
> > > > > > >Apart from that, +1 for the FLIP.
> > > > > > >
> > > > > > >Cheers,
> > > > > > >Till
> > > > > > >
> > > > > > >On Wed, Nov 17, 2021 at 6:05 PM David Morávek 
> > > > wrote:
> > > > > > >
> > > > > > >> Hi everyone,
> > > > > > >>
> > > > > > >> Matthias, Mika and I want to start a discussion about
> > introduction
> > > > of
> > > > > a
> > > > > > new
> > > > > > >> Flink component, the *JobResultStore*.
> > > > > > >>
> > > > > > >> The main motivation is to address shortcomings of the
> > > > > > *RunningJobsRegistry*
> > > > > > >> and surpass it with the new component. These shortcomings have
> > > been
> > > > > > first
> > > > > > >> described in FLINK-11813 [1].
> > > > > > >>
> > > > > > >> This change should improve the overall stability of the
> > > JobManager's
> > > > > > >> components and address the race conditions in some of the fail
> > > over
> > > > > > >> scenarios during the job cleanup lifecycle.
> > > > > > >>
> > > > > > >> It should also help to ensure that Flink doesn't leave any
> > > uncleaned
> > > > > > >> resources behind.
> > > > > > >>
> > > > > > >> 

Re: [DISCUSS] FLIP-197: API stability graduation process

2021-12-06 Thread Timo Walther

Hi Till,

thanks for starting this discussion. I think this topic should have been 
discussed way earlier. I have two questions:


1) It might be an implementation detail but where do you expect this 
`FlinkVersion` to be located? This is actually a quite important class 
that also needs to be made available for other use cases. For the SQL 
upgrade story we will definitely need a similar enum. And I think we 
have something similar for other tests (see `MigrationVersion`). For 
reducing releasing overhead, it would be good to unify all these 
"version metadata".


2) Deprecations: Shall we also start versioning deprecation decisions? 
Esp. for Experimental/PublicEvolving interfaces we should remove 
deprecated code in time. We should also let users know when we are 
planning to remove code. E.g. clearly indicate that the deprecation will 
happen in the next major version?


Regards,
Timo

On 06.12.21 10:04, Till Rohrmann wrote:

Ok, then lets increase the graduation period to 2 releases. If we see that
this is super easy for us to do, then we can shorten it in the future.

Cheers,
Till

On Mon, Dec 6, 2021 at 9:54 AM Chesnay Schepler  wrote:


Given that you can delay the graduation if there is a good reason for
it, we should be able to cover that case even if the graduation would
happen by default after 1 month.

That said, personally I would also be in favor of 2 releases; we see
plenty of users not upgrading to every single Flink version, and this
may  give us a bit more coverage.

On 06/12/2021 09:20, Ingo Bürk wrote:

Hi Till,

from my (admittedly limited) experience with how far projects lag behind

in

terms of Flink versions – yes, the combined time it would take to mature
then seems reasonable enough for a sufficient adoption, IMO.

Another reason why I think two releases as a default for the last step
makes sense: say you mature an API to PublicEvolving. Typically, there

will

be issues found afterwards. Even if you address these in the very next
release cycle, a duration of one release would mean you fully mature the
API in the same release in which things are still being fixed;

intuitively,

it makes sense to me that the step to Public would come after a period of
no changes needed, however.


Ingo

On Fri, Dec 3, 2021 at 4:55 PM Till Rohrmann 

wrote:



Hi Ingo, thanks for your feedback.

Do you think that two release cycles per graduation step would be long
enough or should it be longer?

Cheers,
Till

On Fri, Dec 3, 2021 at 4:29 PM Ingo Bürk  wrote:


Hi Till,

Overall I whole-heartedly agree with the proposals in this FLIP. Thank

you

for starting this discussion as well! This seems like something that

could

be tested quite nicely with ArchUnit as well; I'll be happy to help

should

the FLIP be accepted.


I would propose per default a single release.

The step from PublicEvolving to Public feels more important to me, and

I

would personally suggest making this transition a bit longer. We have a

bit

of a chicken-egg problem here, because the goal of your FLIP is,
ultimately, also to motivate faster adoption of new Flink versions, but

the

status quo prevents that; if we mature APIs too quickly, we risk losing

out

on important feedback. Therefore, I would propose starting slower here,

and

rather think about shortening that cycle in the future.


Best
Ingo

On Thu, Dec 2, 2021 at 3:57 PM Till Rohrmann 

wrote:

Hi everyone,

As promised, here is the follow-up FLIP [1] for discussing how we can
ensure that newly introduced APIs are being stabilized over time. This

FLIP

is related to FLIP-196 [2].

The idea of FLIP-197 is to introduce an API graduation process that

forces

us to increase the API stability guarantee unless there is a very good
reason not to do so. So the proposal is to reverse the process from

opt-in

(increasing the stability guarantee explicitly) to opt-out (deciding

that

an API cannot be graduated with a good reason).

Since every process breaks if it is not automated, we propose a richer

set

of API stability annotations that can capture enough information so

that

we

can implement a test that fails if we fail to follow the process.

Looking forward to your feedback.

Hopefully, we can provide our users a better experience when working

with

Flink because we offer more stable APIs and make them available

faster.


[1] https://cwiki.apache.org/confluence/x/J5eqCw
[2] https://cwiki.apache.org/confluence/x/IJeqCw

Cheers,
Till










Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk

2021-12-06 Thread Roman Khachatryan
Congratulations, Ingo!

Regards,
Roman


On Mon, Dec 6, 2021 at 11:05 AM Yang Wang  wrote:
>
> Congratulations, Ingo!
>
> Best,
> Yang
>
> Sergey Nuyanzin  于2021年12月6日周一 下午3:35写道:
>
> > Congratulations, Ingo!
> >
> > On Mon, Dec 6, 2021 at 7:32 AM Leonard Xu  wrote:
> >
> > > Congratulations, Ingo! Well Deserved.
> > >
> > > Best,
> > > Leonard
> > >
> > > > 2021年12月3日 下午11:24,Ingo Bürk  写道:
> > > >
> > > > Thank you everyone for the warm welcome!
> > > >
> > > >
> > > > Best
> > > > Ingo
> > > >
> > > > On Fri, Dec 3, 2021 at 11:47 AM Ryan Skraba
> >  > > >
> > > > wrote:
> > > >
> > > >> Congratulations Ingo!
> > > >>
> > > >> On Fri, Dec 3, 2021 at 8:17 AM Yun Tang  wrote:
> > > >>
> > > >>> Congratulations, Ingo!
> > > >>>
> > > >>> Best
> > > >>> Yun Tang
> > > >>> 
> > > >>> From: Yuepeng Pan 
> > > >>> Sent: Friday, December 3, 2021 14:14
> > > >>> To: dev@flink.apache.org 
> > > >>> Cc: Ingo Bürk 
> > > >>> Subject: Re:Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> Congratulations, Ingo!
> > > >>>
> > > >>>
> > > >>> Best,
> > > >>> Yuepeng Pan
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> At 2021-12-03 13:47:38, "Yun Gao" 
> > > wrote:
> > >  Congratulations Ingo!
> > > 
> > >  Best,
> > >  Yun
> > > 
> > > 
> > >  --
> > >  From:刘建刚 
> > >  Send Time:2021 Dec. 3 (Fri.) 11:52
> > >  To:dev 
> > >  Cc:"Ingo Bürk" 
> > >  Subject:Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk
> > > 
> > >  Congratulations!
> > > 
> > >  Best,
> > >  Liu Jiangang
> > > 
> > >  Till Rohrmann  于2021年12月2日周四 下午11:24写道:
> > > 
> > > > Hi everyone,
> > > >
> > > > On behalf of the PMC, I'm very happy to announce Ingo Bürk as a new
> > > >>> Flink
> > > > committer.
> > > >
> > > > Ingo has started contributing to Flink since the beginning of this
> > > >>> year. He
> > > > worked mostly on SQL components. He has authored many PRs and
> > helped
> > > >>> review
> > > > a lot of other PRs in this area. He actively reported issues and
> > > >> helped
> > > >>> our
> > > > users on the MLs. His most notable contributions were Support SQL
> > > 2016
> > > >>> JSON
> > > > functions in Flink SQL (FLIP-90), Register sources/sinks in Table
> > API
> > > > (FLIP-129) and various other contributions in the SQL area.
> > Moreover,
> > > >>> he is
> > > > one of the few people in our community who actually understands
> > > >> Flink's
> > > > frontend.
> > > >
> > > > Please join me in congratulating Ingo for becoming a Flink
> > committer!
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > >>>
> > > >>
> > >
> > >
> >
> > --
> > Best regards,
> > Sergey
> >


Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl

2021-12-06 Thread Roman Khachatryan
Congratulations, Matthias!

Regards,
Roman


On Mon, Dec 6, 2021 at 11:04 AM Yang Wang  wrote:
>
> Congratulations, Matthias!
>
> Best,
> Yang
>
> Sergey Nuyanzin  于2021年12月6日周一 下午3:35写道:
>
> > Congratulations, Matthias!
> >
> > On Mon, Dec 6, 2021 at 7:33 AM Leonard Xu  wrote:
> >
> > > Congratulations Matthias!
> > >
> > > Best,
> > > Leonard
> > > > 2021年12月3日 下午11:23,Matthias Pohl  写道:
> > > >
> > > > Thank you! I'm looking forward to continue working with you.
> > > >
> > > > On Fri, Dec 3, 2021 at 7:29 AM Jingsong Li 
> > > wrote:
> > > >
> > > >> Congratulations, Matthias!
> > > >>
> > > >> On Fri, Dec 3, 2021 at 2:13 PM Yuepeng Pan  wrote:
> > > >>>
> > > >>> Congratulations Matthias!
> > > >>>
> > > >>> Best,Yuepeng Pan.
> > > >>> 在 2021-12-03 13:47:20,"Yun Gao"  写道:
> > >  Congratulations Matthias!
> > > 
> > >  Best,
> > >  Yun
> > > 
> > > 
> > >  --
> > >  From:Jing Zhang 
> > >  Send Time:2021 Dec. 3 (Fri.) 13:45
> > >  To:dev 
> > >  Cc:Matthias Pohl 
> > >  Subject:Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl
> > > 
> > >  Congratulations, Matthias!
> > > 
> > >  刘建刚  于2021年12月3日周五 11:51写道:
> > > 
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Liu Jiangang
> > > >
> > > > Till Rohrmann  于2021年12月2日周四 下午11:28写道:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> On behalf of the PMC, I'm very happy to announce Matthias Pohl as
> > a
> > > >> new
> > > >> Flink committer.
> > > >>
> > > >> Matthias has worked on Flink since August last year. He helped
> > > >> review a
> > > > ton
> > > >> of PRs. He worked on a variety of things but most notably the
> > > >> tracking
> > > > and
> > > >> reporting of concurrent exceptions, fixing HA bugs and deprecating
> > > >> and
> > > >> removing our Mesos support. He actively reports issues helping
> > > >> Flink to
> > > >> improve and he is actively engaged in Flink's MLs.
> > > >>
> > > >> Please join me in congratulating Matthias for becoming a Flink
> > > >> committer!
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Best, Jingsong Lee
> > >
> > >
> >
> > --
> > Best regards,
> > Sergey
> >


Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk

2021-12-06 Thread Yang Wang
Congratulations, Ingo!

Best,
Yang

Sergey Nuyanzin  于2021年12月6日周一 下午3:35写道:

> Congratulations, Ingo!
>
> On Mon, Dec 6, 2021 at 7:32 AM Leonard Xu  wrote:
>
> > Congratulations, Ingo! Well Deserved.
> >
> > Best,
> > Leonard
> >
> > > 2021年12月3日 下午11:24,Ingo Bürk  写道:
> > >
> > > Thank you everyone for the warm welcome!
> > >
> > >
> > > Best
> > > Ingo
> > >
> > > On Fri, Dec 3, 2021 at 11:47 AM Ryan Skraba
>  > >
> > > wrote:
> > >
> > >> Congratulations Ingo!
> > >>
> > >> On Fri, Dec 3, 2021 at 8:17 AM Yun Tang  wrote:
> > >>
> > >>> Congratulations, Ingo!
> > >>>
> > >>> Best
> > >>> Yun Tang
> > >>> 
> > >>> From: Yuepeng Pan 
> > >>> Sent: Friday, December 3, 2021 14:14
> > >>> To: dev@flink.apache.org 
> > >>> Cc: Ingo Bürk 
> > >>> Subject: Re:Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Congratulations, Ingo!
> > >>>
> > >>>
> > >>> Best,
> > >>> Yuepeng Pan
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> At 2021-12-03 13:47:38, "Yun Gao" 
> > wrote:
> >  Congratulations Ingo!
> > 
> >  Best,
> >  Yun
> > 
> > 
> >  --
> >  From:刘建刚 
> >  Send Time:2021 Dec. 3 (Fri.) 11:52
> >  To:dev 
> >  Cc:"Ingo Bürk" 
> >  Subject:Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk
> > 
> >  Congratulations!
> > 
> >  Best,
> >  Liu Jiangang
> > 
> >  Till Rohrmann  于2021年12月2日周四 下午11:24写道:
> > 
> > > Hi everyone,
> > >
> > > On behalf of the PMC, I'm very happy to announce Ingo Bürk as a new
> > >>> Flink
> > > committer.
> > >
> > > Ingo has started contributing to Flink since the beginning of this
> > >>> year. He
> > > worked mostly on SQL components. He has authored many PRs and
> helped
> > >>> review
> > > a lot of other PRs in this area. He actively reported issues and
> > >> helped
> > >>> our
> > > users on the MLs. His most notable contributions were Support SQL
> > 2016
> > >>> JSON
> > > functions in Flink SQL (FLIP-90), Register sources/sinks in Table
> API
> > > (FLIP-129) and various other contributions in the SQL area.
> Moreover,
> > >>> he is
> > > one of the few people in our community who actually understands
> > >> Flink's
> > > frontend.
> > >
> > > Please join me in congratulating Ingo for becoming a Flink
> committer!
> > >
> > > Cheers,
> > > Till
> > >
> > >>>
> > >>
> >
> >
>
> --
> Best regards,
> Sergey
>


Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl

2021-12-06 Thread Yang Wang
Congratulations, Matthias!

Best,
Yang

Sergey Nuyanzin  于2021年12月6日周一 下午3:35写道:

> Congratulations, Matthias!
>
> On Mon, Dec 6, 2021 at 7:33 AM Leonard Xu  wrote:
>
> > Congratulations Matthias!
> >
> > Best,
> > Leonard
> > > 2021年12月3日 下午11:23,Matthias Pohl  写道:
> > >
> > > Thank you! I'm looking forward to continue working with you.
> > >
> > > On Fri, Dec 3, 2021 at 7:29 AM Jingsong Li 
> > wrote:
> > >
> > >> Congratulations, Matthias!
> > >>
> > >> On Fri, Dec 3, 2021 at 2:13 PM Yuepeng Pan  wrote:
> > >>>
> > >>> Congratulations Matthias!
> > >>>
> > >>> Best,Yuepeng Pan.
> > >>> 在 2021-12-03 13:47:20,"Yun Gao"  写道:
> >  Congratulations Matthias!
> > 
> >  Best,
> >  Yun
> > 
> > 
> >  --
> >  From:Jing Zhang 
> >  Send Time:2021 Dec. 3 (Fri.) 13:45
> >  To:dev 
> >  Cc:Matthias Pohl 
> >  Subject:Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl
> > 
> >  Congratulations, Matthias!
> > 
> >  刘建刚  于2021年12月3日周五 11:51写道:
> > 
> > > Congratulations!
> > >
> > > Best,
> > > Liu Jiangang
> > >
> > > Till Rohrmann  于2021年12月2日周四 下午11:28写道:
> > >
> > >> Hi everyone,
> > >>
> > >> On behalf of the PMC, I'm very happy to announce Matthias Pohl as
> a
> > >> new
> > >> Flink committer.
> > >>
> > >> Matthias has worked on Flink since August last year. He helped
> > >> review a
> > > ton
> > >> of PRs. He worked on a variety of things but most notably the
> > >> tracking
> > > and
> > >> reporting of concurrent exceptions, fixing HA bugs and deprecating
> > >> and
> > >> removing our Mesos support. He actively reports issues helping
> > >> Flink to
> > >> improve and he is actively engaged in Flink's MLs.
> > >>
> > >> Please join me in congratulating Matthias for becoming a Flink
> > >> committer!
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >
> > >>
> > >>
> > >>
> > >> --
> > >> Best, Jingsong Lee
> >
> >
>
> --
> Best regards,
> Sergey
>


Re: [DISCUSS] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread Yang Wang
Thanks for the fruitful discussion. I also hope that we can remove all
the pointers from the HA store (ZK, ConfigMap) in the future.
After that, we would only rely on ZK/ConfigMap for leader election/retrieval.


Best,
Yang

David Morávek  于2021年12月6日周一 下午4:57写道:

> as all of the concerns seems to be addressed, I'd like to proceed with the
> vote to move things forward.
>
> Thanks everyone for the feedback, it was really helpful!
>
> Best,
> D.
>
> On Wed, Dec 1, 2021 at 6:39 AM Zhu Zhu  wrote:
>
> > Thanks for the explanation Matthias. The solution sounds good to me.
> > I have no more concerns and +1 for the FLIP.
> >
> > Thanks,
> > Zhu
> >
> > Xintong Song  于2021年12月1日周三 下午12:56写道:
> >
> > > @David,
> > >
> > > Thanks for the clarification.
> > >
> > > No more concerns from my side. +1 for this FLIP.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Wed, Dec 1, 2021 at 12:28 AM Till Rohrmann 
> > > wrote:
> > >
> > > > Given the other breaking changes, I think that it is ok to remove the
> > > > `RunningJobsRegistry` completely.
> > > >
> > > > Since we allow users to specify a HighAvailabilityServices
> > implementation
> > > > when starting Flink via `high-availability: FQDN`, I think we should
> > mark
> > > > the interface at least @Experimental.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Nov 30, 2021 at 2:29 PM Mika Naylor 
> wrote:
> > > >
> > > > > Hi Till,
> > > > >
> > > > > We thought that breaking interfaces, specifically
> > > > > HighAvailabilityServices and RunningJobsRegistry, was acceptable in
> > > this
> > > > > instance because:
> > > > >
> > > > > - Neither of these interfaces are marked @Public and so carry no
> > > > >guarantees about being public and stable.
> > > > > - As far as we are aware, we currently have no users with custom
> > > > >HighAvailabilityServices implementations.
> > > > > - The interface was already broken in 1.14 with the changes to
> > > > >CheckpointRecoveryFactory, and will likely be changed again in
> > 1.15
> > > > >due to further changes in that factory.
> > > > >
> > > > > Given that, we thought changes to the interface would not be
> > > disruptive.
> > > > > Perhaps it could be annotated as @Internal - I'm not sure exactly
> > what
> > > > > guarantees we try and give for the stability of the
> > > > > HighAvailabilityServices interface.
> > > > >
> > > > > Kind regards,
> > > > > Mika
> > > > >
> > > > > On 26.11.2021 18:28, Till Rohrmann wrote:
> > > > > >Thanks for creating this FLIP Matthias, Mika and David.
> > > > > >
> > > > > >I think the JobResultStore is an important piece for fixing
> Flink's
> > > last
> > > > > >high-availability problem (afaik). Once we have this piece in
> place,
> > > > users
> > > > > >no longer risk to re-execute a successfully completed job.
> > > > > >
> > > > > >I have one comment concerning breaking interfaces:
> > > > > >
> > > > > >If we don't want to break interfaces, then we could keep the
> > > > > >HighAvailabilityServices.getRunningJobsRegistry() method and add a
> > > > default
> > > > > >implementation for HighAvailabilityServices.getJobResultStore().
> We
> > > > could
> > > > > >then deprecate the former method and then remove it in the
> > subsequent
> > > > > >release (1.16).
> > > > > >
> > > > > >Apart from that, +1 for the FLIP.
> > > > > >
> > > > > >Cheers,
> > > > > >Till
> > > > > >
> > > > > >On Wed, Nov 17, 2021 at 6:05 PM David Morávek 
> > > wrote:
> > > > > >
> > > > > >> Hi everyone,
> > > > > >>
> > > > > >> Matthias, Mika and I want to start a discussion about
> introduction
> > > of
> > > > a
> > > > > new
> > > > > >> Flink component, the *JobResultStore*.
> > > > > >>
> > > > > >> The main motivation is to address shortcomings of the
> > > > > *RunningJobsRegistry*
> > > > > >> and surpass it with the new component. These shortcomings have
> > been
> > > > > first
> > > > > >> described in FLINK-11813 [1].
> > > > > >>
> > > > > >> This change should improve the overall stability of the
> > JobManager's
> > > > > >> components and address the race conditions in some of the fail
> > over
> > > > > >> scenarios during the job cleanup lifecycle.
> > > > > >>
> > > > > >> It should also help to ensure that Flink doesn't leave any
> > uncleaned
> > > > > >> resources behind.
> > > > > >>
> > > > > >> We've prepared a FLIP-194 [2], which outlines the design and
> > > reasoning
> > > > > >> behind this new component.
> > > > > >>
> > > > > >> [1] https://issues.apache.org/jira/browse/FLINK-11813
> > > > > >> [2]
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> > > > > >>
> > > > > >> We're looking forward for your feedback ;)
> > > > > >>
> > > > > >> Best,
> > > > > >> Matthias, Mika and David
> > > > > >>
> > > > >
> > > > > Mika Naylor
> > > > > https://autophagy.io
> > > > >
> > > >
> > >
> >
>
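Till's non-breaking alternative from the discussion above — keep the old accessor but deprecate it, and give the new accessor a default implementation — could look roughly like the sketch below. The types are stand-ins, not the real Flink interfaces, and the thread ultimately settled on breaking the interface and removing `RunningJobsRegistry` outright instead:

```java
// Illustrative stand-ins for the legacy and new components; the real Flink
// interfaces have many more methods.
interface RunningJobsRegistry {}

interface JobResultStore {}

interface HighAvailabilityServices {

    // Legacy accessor: kept for one release, deprecated, then removed (1.16).
    @Deprecated
    RunningJobsRegistry getRunningJobsRegistry();

    // New accessor: the default implementation keeps existing custom
    // HighAvailabilityServices implementations source-compatible.
    default JobResultStore getJobResultStore() {
        return new JobResultStore() {};
    }
}
```

An existing implementation that only provides `getRunningJobsRegistry()` still compiles unchanged, while new code can already call `getJobResultStore()`.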


[jira] [Created] (FLINK-25189) Update Elasticsearch Sinks to latest minor versions

2021-12-06 Thread Alexander Preuss (Jira)
Alexander Preuss created FLINK-25189:


 Summary: Update Elasticsearch Sinks to latest minor versions
 Key: FLINK-25189
 URL: https://issues.apache.org/jira/browse/FLINK-25189
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / ElasticSearch
Affects Versions: 1.15.0
Reporter: Alexander Preuss


We want to bump the Elasticsearch dependencies used in the
elasticsearch-connector modules to their latest respective minor versions
(6.8.20 and 7.15.2).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
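Such a bump would typically amount to updating version properties in the connector modules' pom.xml, roughly as below. The property names are hypothetical; only the target versions (6.8.20, 7.15.2) come from the ticket:

```xml
<!-- Hypothetical sketch: property names are illustrative, not Flink's actual pom. -->
<properties>
  <elasticsearch6.version>6.8.20</elasticsearch6.version>
  <elasticsearch7.version>7.15.2</elasticsearch7.version>
</properties>
```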


Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-06 Thread Yingjie Cao
Hi Till,

Thanks for your feedback.

>>> How will our tests be affected by these changes? Will Flink require
more resources and, thus, will it risk destabilizing our testing
infrastructure?

There are some tests that need to be adjusted, for example,
BlockingShuffleITCase. For other tests, theoretically, the influence should
be small. I will further run all tests multiple times (like 10 or 20) to
ensure that there are no test stability issues before making the change.

>>> I would propose to create a FLIP for these changes since you propose to
change the default behaviour. It can be a very short one, though.

Yes, you are right. I will prepare a simple FLIP soon.

Best,
Yingjie


Till Rohrmann  于2021年12月3日周五 18:39写道:

> Thanks for starting this discussion Yingjie,
>
> How will our tests be affected by these changes? Will Flink require more
> resources and, thus, will it risk destabilizing our testing infrastructure?
>
> I would propose to create a FLIP for these changes since you propose to
> change the default behaviour. It can be a very short one, though.
>
> Cheers,
> Till
>
> On Fri, Dec 3, 2021 at 10:02 AM Yingjie Cao 
> wrote:
>
>> Hi dev & users,
>>
>> We propose to change some default values of blocking shuffle to improve
>> the user out-of-box experience (not influence streaming). The default
>> values we want to change are as follows:
>>
>> 1. Data compression
>> (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the
>> default value is 'false'.  Usually, data compression can reduce both disk
>> and network IO which is good for performance. At the same time, it can save
>> storage space. We propose to change the default value to true.
>>
>> 2. Default shuffle implementation
>> (taskmanager.network.sort-shuffle.min-parallelism): Currently, the default
>> value is 'Integer.MAX', which means by default, Flink jobs will always use
>> hash-shuffle. In fact, for high parallelism, sort-shuffle is better for
>> both stability and performance. So we propose to reduce the default value
>> to a proper smaller one, for example, 128. (We tested 128, 256, 512 and
>> 1024 with a tpc-ds and 128 is the best one.)
>>
>> 3. Read buffer of sort-shuffle
>> (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently, the
>> default value is '32M'. Previously, when choosing the default value, both
>> ‘32M' and '64M' are OK for tests and we chose the smaller one in a cautious
>> way. However, recently, it is reported in the mailing list that the default
>> value is not enough which caused a buffer request timeout issue. We already
>> created a ticket to improve the behavior. At the same time, we propose to
>> increase this default value to '64M' which can also help.
>>
>> 4. Sort buffer size of sort-shuffle
>> (taskmanager.network.sort-shuffle.min-buffers): Currently, the default
>> value is '64' which means '64' network buffers (32k per buffer by default).
>> This default value is quite modest and the performance can be influenced.
>> We propose to increase this value to a larger one, for example, 512 (the
>> default TM and network buffer configuration can serve more than 10
>> result partitions concurrently).
>>
>> We already tested these default values together with tpc-ds benchmark in
>> a cluster and both the performance and stability improved a lot. These
>> changes can help to improve the out-of-box experience of blocking shuffle.
>> What do you think about these changes? Is there any concern? If there are
>> no objections, I will make these changes soon.
>>
>> Best,
>> Yingjie
>>
>
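Collected for quick reference, the four proposed defaults above, written as a flink-conf.yaml fragment. This is a sketch of the proposal as quoted: the option keys are taken verbatim from the message, and the exact values were still under discussion at this point:

```yaml
# Proposed new blocking-shuffle defaults (streaming jobs are unaffected)
taskmanager.network.blocking-shuffle.compression.enabled: true
taskmanager.network.sort-shuffle.min-parallelism: 128
taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m
taskmanager.network.sort-shuffle.min-buffers: 512
```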


[jira] [Created] (FLINK-25188) Cannot install PyFlink in M1 CPU

2021-12-06 Thread Ada Wong (Jira)
Ada Wong created FLINK-25188:


 Summary: Cannot install PyFlink in M1 CPU
 Key: FLINK-25188
 URL: https://issues.apache.org/jira/browse/FLINK-25188
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.0
Reporter: Ada Wong


ERROR: Could not find a version that satisfies the requirement 
pandas<1.2.0,>=1.0 (from apache-flink)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-197: API stability graduation process

2021-12-06 Thread Till Rohrmann
Ok, then lets increase the graduation period to 2 releases. If we see that
this is super easy for us to do, then we can shorten it in the future.

Cheers,
Till

On Mon, Dec 6, 2021 at 9:54 AM Chesnay Schepler  wrote:

> Given that you can delay the graduation if there is a good reason for
> it, we should be able to cover that case even if the graduation would
> happen by default after 1 month.
>
> That said, personally I would also be in favor of 2 releases; we see
> plenty of users not upgrading to every single Flink version, and this
> may  give us a bit more coverage.
>
> On 06/12/2021 09:20, Ingo Bürk wrote:
> > Hi Till,
> >
> > from my (admittedly limited) experience with how far projects lag behind
> in
> > terms of Flink versions – yes, the combined time it would take to mature
> > then seems reasonable enough for a sufficient adoption, IMO.
> >
> > Another reason why I think two releases as a default for the last step
> > makes sense: say you mature an API to PublicEvolving. Typically, there
> will
> > be issues found afterwards. Even if you address these in the very next
> > release cycle, a duration of one release would mean you fully mature the
> > API in the same release in which things are still being fixed;
> intuitively,
> > it makes sense to me that the step to Public would come after a period of
> > no changes needed, however.
> >
> >
> > Ingo
> >
> > On Fri, Dec 3, 2021 at 4:55 PM Till Rohrmann 
> wrote:
> >
> >> Hi Ingo, thanks for your feedback.
> >>
> >> Do you think that two release cycles per graduation step would be long
> >> enough or should it be longer?
> >>
> >> Cheers,
> >> Till
> >>
> >> On Fri, Dec 3, 2021 at 4:29 PM Ingo Bürk  wrote:
> >>
> >>> Hi Till,
> >>>
> >>> Overall I whole-heartedly agree with the proposals in this FLIP. Thank
> >> you
> >>> for starting this discussion as well! This seems like something that
> >> could
> >>> be tested quite nicely with ArchUnit as well; I'll be happy to help
> >> should
> >>> the FLIP be accepted.
> >>>
>  I would propose per default a single release.
> >>> The step from PublicEvolving to Public feels more important to me, and
> I
> >>> would personally suggest making this transition a bit longer. We have a
> >> bit
> >>> of a chicken-egg problem here, because the goal of your FLIP is,
> >>> ultimately, also to motivate faster adoption of new Flink versions, but
> >> the
> >>> status quo prevents that; if we mature APIs too quickly, we risk losing
> >> out
> >>> on important feedback. Therefore, I would propose starting slower here,
> >> and
> >>> rather think about shortening that cycle in the future.
> >>>
> >>>
> >>> Best
> >>> Ingo
> >>>
> >>> On Thu, Dec 2, 2021 at 3:57 PM Till Rohrmann 
> >> wrote:
>  Hi everyone,
> 
>  As promised, here is the follow-up FLIP [1] for discussing how we can
>  ensure that newly introduced APIs are being stabilized over time. This
> >>> FLIP
>  is related to FLIP-196 [2].
> 
>  The idea of FLIP-197 is to introduce an API graduation process that
> >>> forces
>  us to increase the API stability guarantee unless there is a very good
>  reason not to do so. So the proposal is to reverse the process from
> >>> opt-in
>  (increasing the stability guarantee explicitly) to opt-out (deciding
> >> that
>  an API cannot be graduated with a good reason).
> 
>  Since every process breaks if it is not automated, we propose a richer
> >>> set
>  of API stability annotations that can capture enough information so
> >> that
> >>> we
>  can implement a test that fails if we fail to follow the process.
> 
>  Looking forward to your feedback.
> 
>  Hopefully, we can provide our users a better experience when working
> >> with
>  Flink because we offer more stable APIs and make them available
> faster.
> 
>  [1] https://cwiki.apache.org/confluence/x/J5eqCw
>  [2] https://cwiki.apache.org/confluence/x/IJeqCw
> 
>  Cheers,
>  Till
> 
>
>
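The "richer set of API stability annotations" mentioned in the proposal could, for example, record the release in which an API was introduced, so that an automated test can detect APIs that are overdue for graduation. A minimal, purely illustrative sketch — these are not Flink's actual annotations, and the field name is an assumption:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stability annotation carrying graduation metadata. Retaining it
// at runtime lets an architecture test compute "age in releases" and fail the
// build once an API should have graduated (unless explicitly opted out).
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Experimental {
    /** Release in which the API was introduced, e.g. "1.15". */
    String since();
}

// Example API carrying the metadata the graduation test would inspect.
@Experimental(since = "1.15")
class NewShinyApi {}
```

A test could then iterate over all annotated classes, compare `since()` against the current release, and fail for any API that exceeded the two-release graduation period without an opt-out.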


Re: [DISCUSS] FLIP-196: Source API stability guarantees

2021-12-06 Thread Ingo Bürk
Hi Till,

seems I misunderstood it then; thanks for the clarification! And yes, with
that I would fully agree.


Ingo

On Mon, Dec 6, 2021 at 9:59 AM Till Rohrmann  wrote:

> Hi Ingo,
>
> No, the added method can have a weaker stability guarantee as long as the
> user does not have to implement it. In order to give an example the
> following extension would be ok imo:
>
> @Public
> interface Foobar {
> @Public
> int foo();
>
> @Experimental
> default ExperimentalResult bar() {
>   return ExperimentalResult.notSupported();
> }
> }
>
> The following extension would not be ok because here the user needs to
> implement something new:
>
> @Public
> interface Foobar {
> @Public
> int foo();
>
> @Experimental
> ExperimentalResult bar();
> }
>
> Moreover, if the user uses bar(), then he opts-in to only get @Experimental
> stability guarantees.
>
> I will add this example to the FLIP for illustrative purposes.
>
> Cheers,
> Till
>
> On Fri, Dec 3, 2021 at 6:52 PM Ingo Bürk  wrote:
>
> > > Would it be enough to say that for example all classes in the module
> > flink-java have to be annotated? What I would like to avoid is having to
> > annotate all classes in some internal module like flink-rpc.
> >
> > I don't think it is, but we certainly could restrict it to certain top
> > level o.a.f.xyz packages.
> >
> > > Extending existing classes will only be possible if you can provide a
> > default implementation
> >
> > That I'm totally fine with, but based on that sentence in the FLIP if I
> > have a public interface and extend it, even with a default
> implementation,
> > I _have_ to have this method be stable already as well, right? I couldn't
> > for example add an experimental method to an interface.
> >
> > This would also include all classes used as argument and return type of
> > such methods too, which seems quite restrictive.
> >
> >
> > Best
> > Ingo
> >
> > On Fri, Dec 3, 2021, 17:51 Till Rohrmann  wrote:
> >
> > > >That's still a much weaker requirement, though, as classes can just be
> > > left unannotated, which is why I prefer annotating all classes
> regardless
> > > of location.
> > >
> > > Would it be enough to say that for example all classes in the module
> > > flink-java have to be annotated? What I would like to avoid is having
> to
> > > annotate all classes in some internal module like flink-rpc.
> > >
> > > > How would you handle e.g. extending an existing Public interface
> with a
> > > new method in this case, though? You'd be forced to immediately make
> the
> > > new method Public as well, or place it somewhere else entirely, which
> > leads
> > > to unfavorable design. I don't think we should disallow extending
> classes
> > > with methods of a weaker stability.
> > >
> > > Extending existing classes will only be possible if you can provide a
> > > default implementation. If the user needs to do something, then it is
> not
> > > compatible and needs to be handled differently (e.g. by offering a new
> > > experimental interface that one can use). If we don't enforce this,
> then
> > I
> > > don't see how we can provide source stability guarantees.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Fri, Dec 3, 2021 at 5:22 PM Ingo Bürk  wrote:
> > >
> > > > Hi Till,
> > > >
> > > > > Personally, I'd be fine to say that in API modules (tbd what this
> is
> > > > > (probably transitive closure of all APIs)) we require every class
> to
> > be
> > > > > annotated.
> > > >
> > > > At least we'll then need the reverse rule: no classes outside *.api.*
> > > > packages CAN have an API annotation (other than Internal), of course
> > with
> > > > many existing violations that need to be accapted.
> > > >
> > > > That's still a much weaker requirement, though, as classes can just
> be
> > > left
> > > > unannotated, which is why I prefer annotating all classes regardless
> of
> > > > location.
> > > >
> > > > > If we have cases that violate the guideline, then I think we either
> > > have
> > > > to
> > > > > remove these methods
> > > >
> > > > How would you handle e.g. extending an existing Public interface
> with a
> > > new
> > > > method in this case, though? You'd be forced to immediately make the
> > new
> > > > method Public as well, or place it somewhere else entirely, which
> leads
> > > to
> > > > unfavorable design. I don't think we should disallow extending
> classes
> > > with
> > > > methods of a weaker stability.
> > > >
> > > >
> > > > Best
> > > > Ingo
> > > >
> > > > On Fri, Dec 3, 2021 at 4:53 PM Till Rohrmann 
> > > wrote:
> > > >
> > > > > Hi Ingo, thanks a lot for your thoughts and the work you've spent
> on
> > > this
> > > > > topic already.
> > > > >
> > > > > Personally, I'd be fine to say that in API modules (tbd what this
> is
> > > > > (probably transitive closure of all APIs)) we require every class
> to
> > be
> > > > > annotated.
> > > > >
> > > > > For sake of clarity and having clear rules, this should then also
> > apply
> > > > to

Re: [VOTE] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread Till Rohrmann
+1 (binding)

Cheers,
Till

On Mon, Dec 6, 2021 at 10:00 AM David Morávek  wrote:

> Hi everyone,
>
> I'd like to open a vote on FLIP-194: Introduce the JobResultStore [1] which
> has been
> discussed in this thread [2].
> The vote will be open for at least 72 hours unless there is an objection or
> not enough votes.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> [2] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94
>
> Best,
> D.
>


[VOTE] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread David Morávek
Hi everyone,

I'd like to open a vote on FLIP-194: Introduce the JobResultStore [1] which
has been
discussed in this thread [2].
The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
[2] https://lists.apache.org/thread/wlzv02jqtq221kb8dnm82v4xj8tomd94

Best,
D.


Re: [DISCUSS] FLIP-196: Source API stability guarantees

2021-12-06 Thread Till Rohrmann
Hi Ingo,

No, the added method can have a weaker stability guarantee as long as the
user does not have to implement it. In order to give an example the
following extension would be ok imo:

@Public
interface Foobar {
    @Public
    int foo();

    @Experimental
    default ExperimentalResult bar() {
        return ExperimentalResult.notSupported();
    }
}

The following extension would not be ok because here the user needs to
implement something new:

@Public
interface Foobar {
    @Public
    int foo();

    @Experimental
    ExperimentalResult bar();
}

Moreover, if the user uses bar(), then they opt in to getting only
@Experimental stability guarantees.

I will add this example to the FLIP for illustrative purposes.

Cheers,
Till

On Fri, Dec 3, 2021 at 6:52 PM Ingo Bürk  wrote:

> > Would it be enough to say that for example all classes in the module
> flink-java have to be annotated? What I would like to avoid is having to
> annotate all classes in some internal module like flink-rpc.
>
> I don't think it is, but we certainly could restrict it to certain top
> level o.a.f.xyz packages.
>
> > Extending existing classes will only be possible if you can provide a
> default implementation
>
> That I'm totally fine with, but based on that sentence in the FLIP if I
> have a public interface and extend it, even with a default implementation,
> I _have_ to have this method be stable already as well, right? I couldn't
> for example add an experimental method to an interface.
>
> This would also include all classes used as argument and return type of
> such methods too, which seems quite restrictive.
>
>
> Best
> Ingo
>
> On Fri, Dec 3, 2021, 17:51 Till Rohrmann  wrote:
>
> > >That's still a much weaker requirement, though, as classes can just be
> > left unannotated, which is why I prefer annotating all classes regardless
> > of location.
> >
> > Would it be enough to say that for example all classes in the module
> > flink-java have to be annotated? What I would like to avoid is having to
> > annotate all classes in some internal module like flink-rpc.
> >
> > > How would you handle e.g. extending an existing Public interface with a
> > new method in this case, though? You'd be forced to immediately make the
> > new method Public as well, or place it somewhere else entirely, which
> leads
> > to unfavorable design. I don't think we should disallow extending classes
> > with methods of a weaker stability.
> >
> > Extending existing classes will only be possible if you can provide a
> > default implementation. If the user needs to do something, then it is not
> > compatible and needs to be handled differently (e.g. by offering a new
> > experimental interface that one can use). If we don't enforce this, then
> I
> > don't see how we can provide source stability guarantees.
> >
> > Cheers,
> > Till
> >
> > On Fri, Dec 3, 2021 at 5:22 PM Ingo Bürk  wrote:
> >
> > > Hi Till,
> > >
> > > > Personally, I'd be fine to say that in API modules (tbd what this is
> > > > (probably transitive closure of all APIs)) we require every class to
> be
> > > > annotated.
> > >
> > > At least we'll then need the reverse rule: no classes outside *.api.*
> > > packages CAN have an API annotation (other than Internal), of course
> with
> > > many existing violations that need to be accapted.
> > >
> > > That's still a much weaker requirement, though, as classes can just be
> > left
> > > unannotated, which is why I prefer annotating all classes regardless of
> > > location.
> > >
> > > > If we have cases that violate the guideline, then I think we either
> > have
> > > to
> > > > remove these methods
> > >
> > > How would you handle e.g. extending an existing Public interface with a
> > new
> > > method in this case, though? You'd be forced to immediately make the
> new
> > > method Public as well, or place it somewhere else entirely, which leads
> > to
> > > unfavorable design. I don't think we should disallow extending classes
> > with
> > > methods of a weaker stability.
> > >
> > >
> > > Best
> > > Ingo
> > >
> > > On Fri, Dec 3, 2021 at 4:53 PM Till Rohrmann 
> > wrote:
> > >
> > > > Hi Ingo, thanks a lot for your thoughts and the work you've spent on
> > this
> > > > topic already.
> > > >
> > > > Personally, I'd be fine to say that in API modules (tbd what this is
> > > > (probably transitive closure of all APIs)) we require every class to
> be
> > > > annotated.
> > > >
> > > > For sake of clarity and having clear rules, this should then also
> apply
> > > to
> > > > nested types. As you've said this would have some additional benefits
> > > when
> > > > doing refactorings and seems to be actually required by japicmp.
> > > >
> > > > >This seems to be quite incompatible with the current interpretation
> in
> > > the
> > > > code base, and it would prevent valid (and in-use) use cases like
> > marking
> > > > e.g. a single method experimental (or even internal) in an otherwise
> > > public
> > > > (evolving) API.
> > > >
> > > > 

Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-12-06 Thread godfrey he
Hi, Timo,

Thanks for the detailed explanation.

> We change an operator state of B in Flink 1.16. We perform the change in the 
> operator of B in a way to support both state layouts. Thus, no need for a new 
> ExecNode version.

I think this design makes things more complex.
1. If there are multiple state layouts, which layout should the ExecNode use?
It increases the cost of understanding for developers (especially for
newcomers to Flink),
making them prone to mistakes.
2. `supportedPlanChanges ` and `supportedSavepointChanges ` are a bit obscure.

The purpose of the ExecNode annotations is not only to support powerful validation,
but more importantly to make them easy for developers to understand,
ensuring that every modification is easy and state compatible.

I would prefer that, once the state layout changes, the ExecNode version
is also updated, which keeps things simple. How about
renaming `supportedPlanChanges` to `planCompatibleVersion`
(meaning the plan is compatible with the plan generated by the
node of the given version)
and renaming `supportedSavepointChanges` to `savepointCompatibleVersion`
(meaning the state is compatible with the state generated by the
node of the given version)?
The names also indicate that only one version value can be set.
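A rough sketch of what the renamed annotation could look like. All type and attribute names here are my assumptions for illustration, not the actual Flink API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical shape of the renamed ExecNode metadata annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ExecNodeMetadata {
    String name();
    int version();
    // Single version values, as proposed: the plan/state are compatible
    // with what the node of exactly that version produced.
    String planCompatibleVersion();
    String savepointCompatibleVersion();
}

// Example: ExecNode B changed its state layout in 1.16, so the node
// version is bumped while the plan stays compatible with 1.15.
@ExecNodeMetadata(
        name = "B",
        version = 2,
        planCompatibleVersion = "1.15",
        savepointCompatibleVersion = "1.16")
class StreamExecNodeB {}

public class AnnotationSketch {
    public static void main(String[] args) {
        ExecNodeMetadata meta =
                StreamExecNodeB.class.getAnnotation(ExecNodeMetadata.class);
        System.out.println(meta.name() + " plan-compatible since "
                + meta.planCompatibleVersion());
    }
}
```

With a single value per attribute, a validation step could compare the annotation against the version recorded in a restored plan and fail fast on a mismatch.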

WDYT?

Best,
Godfrey

Timo Walther wrote on Thursday, December 2, 2021 at 11:42 PM:
>
> Response to Marios's feedback:
>
>  > there should be some good logging in place when the upgrade is taking
> place
>
> Yes, I agree. I added this part to the FLIP.
>
>  > config option instead that doesn't provide the flexibility to
> overwrite certain plans
>
> One can set the config option also around sections of the
> multi-statement SQL script.
>
> SET 'table.plan.force-recompile'='true';
>
> COMPILE ...
>
> SET 'table.plan.force-recompile'='false';
>
> But the question is why a user wants to run COMPILE multiple times. If
> it is during development, then running EXECUTE (or just the statement
> itself) without calling COMPILE should be sufficient. The file can also
> manually be deleted if necessary.
>
> What do you think?
>
> Regards,
> Timo
>
>
>
> On 02.12.21 16:09, Timo Walther wrote:
> > Hi Till,
> >
> > Yes, you might have to. But not a new plan from the SQL query but a
> > migration from the old plan to the new plan. This will not happen often.
> > But we need a way to evolve the format of the JSON plan itself.
> >
> > Maybe this confuses a bit, so let me clarify it again: Mostly ExecNode
> > versions and operator state layouts will evolve. Not the plan files,
> > those will be pretty stable. But also not infinitely.
> >
> > Regards,
> > Timo
> >
> >
> > On 02.12.21 16:01, Till Rohrmann wrote:
> >> Then for migrating from Flink 1.10 to 1.12, I might have to create a new
> >> plan using Flink 1.11 in order to migrate from Flink 1.11 to 1.12, right?
> >>
> >> Cheers,
> >> Till
> >>
> >> On Thu, Dec 2, 2021 at 3:39 PM Timo Walther  wrote:
> >>
> >>> Response to Till's feedback:
> >>>
> >>>   > compiled plan won't be changed after being written initially
> >>>
> >>> This is not entirely correct. We give guarantees for keeping the query
> >>> up and running. We reserve us the right to force plan migrations. In
> >>> this case, the plan might not be created from the SQL statement but from
> >>> the old plan. I have added an example in section 10.1.1. In general,
> >>> both persisted entities "plan" and "savepoint" can evolve independently
> >>> from each other.
> >>>
> >>> Thanks,
> >>> Timo
> >>>
> >>> On 02.12.21 15:10, Timo Walther wrote:
>  Response to Godfrey's feedback:
> 
>    > "EXPLAIN PLAN EXECUTE STATEMENT SET BEGIN ... END" is missing.
> 
>  Thanks for the hint. I added a dedicated section 7.1.3.
> 
> 
>    > it's hard to maintain the supported versions for
>  "supportedPlanChanges" and "supportedSavepointChanges"
> 
>  Actually, I think we are mostly on the same page.
> 
>  The annotation does not need to be updated for every Flink version. As
>  the name suggests it is about "Changes" (in other words:
>  incompatibilities) that require some kind of migration. Either plan
>  migration (= PlanChanges) or savepoint migration (=SavepointChanges,
>  using operator migration or savepoint migration).
> 
>  Let's assume we introduced two ExecNodes A and B in Flink 1.15.
> 
>  The annotations are:
> 
>  @ExecNodeMetadata(name=A, supportedPlanChanges=1.15,
>  supportedSavepointChanges=1.15)
> 
>  @ExecNodeMetadata(name=B, supportedPlanChanges=1.15,
>  supportedSavepointChanges=1.15)
> 
>  We change an operator state of B in Flink 1.16.
> 
>  We perform the change in the operator of B in a way to support both
>  state layouts. Thus, no need for a new ExecNode version.
> 
>  The annotations in 1.16 are:
> 
>  @ExecNodeMetadata(name=A, supportedPlanChanges=1.15,
>  supportedSavepointChanges=1.15)
> 
>  @ExecNodeMetadata(name=B, 

Re: [DISCUSS] FLIP-194: Introduce the JobResultStore

2021-12-06 Thread David Morávek
As all of the concerns seem to be addressed, I'd like to proceed with the
vote to move things forward.

Thanks everyone for the feedback, it was really helpful!

Best,
D.

On Wed, Dec 1, 2021 at 6:39 AM Zhu Zhu  wrote:

> Thanks for the explanation Matthias. The solution sounds good to me.
> I have no more concerns and +1 for the FLIP.
>
> Thanks,
> Zhu
>
> Xintong Song wrote on Wednesday, December 1, 2021 at 12:56 PM:
>
> > @David,
> >
> > Thanks for the clarification.
> >
> > No more concerns from my side. +1 for this FLIP.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Dec 1, 2021 at 12:28 AM Till Rohrmann 
> > wrote:
> >
> > > Given the other breaking changes, I think that it is ok to remove the
> > > `RunningJobsRegistry` completely.
> > >
> > > Since we allow users to specify a HighAvailabilityServices
> implementation
> > > when starting Flink via `high-availability: FQDN`, I think we should
> mark
> > > the interface at least @Experimental.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Nov 30, 2021 at 2:29 PM Mika Naylor  wrote:
> > >
> > > > Hi Till,
> > > >
> > > > We thought that breaking interfaces, specifically
> > > > HighAvailabilityServices and RunningJobsRegistry, was acceptable in
> > this
> > > > instance because:
> > > >
> > > > - Neither of these interfaces are marked @Public and so carry no
> > > >guarantees about being public and stable.
> > > > - As far as we are aware, we currently have no users with custom
> > > >HighAvailabilityServices implementations.
> > > > - The interface was already broken in 1.14 with the changes to
> > > >CheckpointRecoveryFactory, and will likely be changed again in
> 1.15
> > > >due to further changes in that factory.
> > > >
> > > > Given that, we thought changes to the interface would not be
> > disruptive.
> > > > Perhaps it could be annotated as @Internal - I'm not sure exactly
> what
> > > > guarantees we try and give for the stability of the
> > > > HighAvailabilityServices interface.
> > > >
> > > > Kind regards,
> > > > Mika
> > > >
> > > > On 26.11.2021 18:28, Till Rohrmann wrote:
> > > > >Thanks for creating this FLIP Matthias, Mika and David.
> > > > >
> > > > >I think the JobResultStore is an important piece for fixing Flink's
> > last
> > > > >high-availability problem (afaik). Once we have this piece in place,
> > > users
> > > > >no longer risk to re-execute a successfully completed job.
> > > > >
> > > > >I have one comment concerning breaking interfaces:
> > > > >
> > > > >If we don't want to break interfaces, then we could keep the
> > > > >HighAvailabilityServices.getRunningJobsRegistry() method and add a
> > > default
> > > > >implementation for HighAvailabilityServices.getJobResultStore(). We
> > > could
> > > > >then deprecate the former method and then remove it in the
> subsequent
> > > > >release (1.16).
> > > > >
> > > > >Apart from that, +1 for the FLIP.
> > > > >
> > > > >Cheers,
> > > > >Till
> > > > >
> > > > >On Wed, Nov 17, 2021 at 6:05 PM David Morávek 
> > wrote:
> > > > >
> > > > >> Hi everyone,
> > > > >>
> > > > >> Matthias, Mika and I want to start a discussion about introduction
> > of
> > > a
> > > > new
> > > > >> Flink component, the *JobResultStore*.
> > > > >>
> > > > >> The main motivation is to address shortcomings of the
> > > > *RunningJobsRegistry*
> > > > >> and surpass it with the new component. These shortcomings have
> been
> > > > first
> > > > >> described in FLINK-11813 [1].
> > > > >>
> > > > >> This change should improve the overall stability of the
> JobManager's
> > > > >> components and address the race conditions in some of the fail
> over
> > > > >> scenarios during the job cleanup lifecycle.
> > > > >>
> > > > >> It should also help to ensure that Flink doesn't leave any
> uncleaned
> > > > >> resources behind.
> > > > >>
> > > > >> We've prepared a FLIP-194 [2], which outlines the design and
> > reasoning
> > > > >> behind this new component.
> > > > >>
> > > > >> [1] https://issues.apache.org/jira/browse/FLINK-11813
> > > > >> [2]
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726435
> > > > >>
> > > > >> We're looking forward for your feedback ;)
> > > > >>
> > > > >> Best,
> > > > >> Matthias, Mika and David
> > > > >>
> > > >
> > > > Mika Naylor
> > > > https://autophagy.io
> > > >
> > >
> >
>
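Till's non-breaking suggestion above (keep `getRunningJobsRegistry()` deprecated and add `getJobResultStore()` with a default implementation) could be sketched roughly as follows. The interfaces are simplified stand-ins, not the real Flink classes:

```java
// Placeholder types standing in for the real Flink interfaces.
interface RunningJobsRegistry {}
interface JobResultStore {}

interface HighAvailabilityServices {
    /** Old accessor, deprecated and removed in a subsequent release. */
    @Deprecated
    RunningJobsRegistry getRunningJobsRegistry();

    /** New accessor; existing implementations keep compiling unchanged. */
    default JobResultStore getJobResultStore() {
        throw new UnsupportedOperationException(
                "JobResultStore not supported by this implementation");
    }
}

public class DefaultMethodSketch {
    public static void main(String[] args) {
        // A pre-existing implementation that only knows the old method
        // still satisfies the interface.
        HighAvailabilityServices legacy = () -> null;
        boolean unsupported = false;
        try {
            legacy.getJobResultStore();
        } catch (UnsupportedOperationException e) {
            unsupported = true;
        }
        System.out.println("default falls back: " + unsupported);
    }
}
```

The default method is what makes the change source compatible; whether throwing or returning a no-op store is the better fallback would be a design decision for the FLIP.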


Re: [DISCUSS] FLIP-197: API stability graduation process

2021-12-06 Thread Chesnay Schepler
Given that you can delay the graduation if there is a good reason for 
it, we should be able to cover that case even if the graduation would 
happen by default after 1 month.


That said, personally I would also be in favor of 2 releases; we see 
plenty of users not upgrading to every single Flink version, and this 
may give us a bit more coverage.


On 06/12/2021 09:20, Ingo Bürk wrote:

Hi Till,

from my (admittedly limited) experience with how far projects lag behind in
terms of Flink versions – yes, the combined time it would take to mature
then seems reasonable enough for a sufficient adoption, IMO.

Another reason why I think two releases as a default for the last step
makes sense: say you mature an API to PublicEvolving. Typically, there will
be issues found afterwards. Even if you address these in the very next
release cycle, a duration of one release would mean you fully mature the
API in the same release in which things are still being fixed; intuitively,
it makes sense to me that the step to Public would come after a period of
no changes needed, however.


Ingo

On Fri, Dec 3, 2021 at 4:55 PM Till Rohrmann  wrote:


Hi Ingo, thanks for your feedback.

Do you think that two release cycles per graduation step would be long
enough or should it be longer?

Cheers,
Till

On Fri, Dec 3, 2021 at 4:29 PM Ingo Bürk  wrote:


Hi Till,

Overall I whole-heartedly agree with the proposals in this FLIP. Thank

you

for starting this discussion as well! This seems like something that

could

be tested quite nicely with ArchUnit as well; I'll be happy to help

should

the FLIP be accepted.


I would propose per default a single release.

The step from PublicEvolving to Public feels more important to me, and I
would personally suggest making this transition a bit longer. We have a

bit

of a chicken-egg problem here, because the goal of your FLIP is,
ultimately, also to motivate faster adoption of new Flink versions, but

the

status quo prevents that; if we mature APIs too quickly, we risk losing

out

on important feedback. Therefore, I would propose starting slower here,

and

rather think about shortening that cycle in the future.


Best
Ingo

On Thu, Dec 2, 2021 at 3:57 PM Till Rohrmann 

wrote:

Hi everyone,

As promised, here is the follow-up FLIP [1] for discussing how we can
ensure that newly introduced APIs are being stabilized over time. This

FLIP

is related to FLIP-196 [2].

The idea of FLIP-197 is to introduce an API graduation process that

forces

us to increase the API stability guarantee unless there is a very good
reason not to do so. So the proposal is to reverse the process from

opt-in

(increasing the stability guarantee explicitly) to opt-out (deciding

that

an API cannot be graduated with a good reason).

Since every process breaks if it is not automated, we propose a richer

set

of API stability annotations that can capture enough information so

that

we

can implement a test that fails if we fail to follow the process.

Looking forward to your feedback.

Hopefully, we can provide our users a better experience when working

with

Flink because we offer more stable APIs and make them available faster.

[1] https://cwiki.apache.org/confluence/x/J5eqCw
[2] https://cwiki.apache.org/confluence/x/IJeqCw

Cheers,
Till
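To illustrate the kind of automated check discussed in this thread, here is a minimal reflection-based sketch that flags API classes missing a stability annotation. The annotation names are modeled after Flink's, but this is only an illustration; a real implementation would likely use ArchUnit, as Ingo suggested:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for Flink's stability annotations.
@Retention(RetentionPolicy.RUNTIME) @interface Public {}
@Retention(RetentionPolicy.RUNTIME) @interface PublicEvolving {}
@Retention(RetentionPolicy.RUNTIME) @interface Experimental {}

@PublicEvolving
class StreamingApi {}

class UnannotatedApi {} // would be flagged by the check

public class StabilityCheckSketch {
    // True if the class carries any of the stability annotations.
    static boolean hasStabilityAnnotation(Class<?> clazz) {
        return clazz.isAnnotationPresent(Public.class)
                || clazz.isAnnotationPresent(PublicEvolving.class)
                || clazz.isAnnotationPresent(Experimental.class);
    }

    public static void main(String[] args) {
        System.out.println("StreamingApi annotated: "
                + hasStabilityAnnotation(StreamingApi.class));
        System.out.println("UnannotatedApi annotated: "
                + hasStabilityAnnotation(UnannotatedApi.class));
    }
}
```

A build-time test could scan the API packages this way and fail when an unannotated class is found, or, extending the idea, when an annotation's recorded "since" version implies the graduation deadline has passed.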





[jira] [Created] (FLINK-25187) Apply padding for BINARY()

2021-12-06 Thread Marios Trivyzas (Jira)
Marios Trivyzas created FLINK-25187:
---

 Summary: Apply padding for BINARY()
 Key: FLINK-25187
 URL: https://issues.apache.org/jira/browse/FLINK-25187
 Project: Flink
  Issue Type: Sub-task
Reporter: Marios Trivyzas






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25186) KafkaDynamicTableFactoryTest and UpsertKafkaDynamicTableFactoryTest fail on AZP

2021-12-06 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-25186:
-

 Summary: KafkaDynamicTableFactoryTest and 
UpsertKafkaDynamicTableFactoryTest fail on AZP
 Key: FLINK-25186
 URL: https://issues.apache.org/jira/browse/FLINK-25186
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Table SQL / Ecosystem
Affects Versions: 1.15.0
Reporter: Till Rohrmann
 Fix For: 1.15.0


A lot of {{KafkaDynamicTableFactoryTest}} and 
{{UpsertKafkaDynamicTableFactoryTest}} tests fail on AZP.

{code}
Dec 06 03:00:28 [ERROR]   
UpsertKafkaDynamicTableFactoryTest.testInvalidSinkBufferFlush 
Dec 06 03:00:28 Expected: (an instance of 
org.apache.flink.table.api.ValidationException and Expected failure cause is 
)
Dec 06 03:00:28  but: Expected failure cause is 
 The throwable 
{code}

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27569=logs=ce8f3cc3-c1ea-5281-f5eb-df9ebd24947f=918e890f-5ed9-5212-a25e-962628fb4bc5=10186



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-06 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-25185:
-

 Summary: StreamFaultToleranceTestBase hangs on AZP
 Key: FLINK-25185
 URL: https://issues.apache.org/jira/browse/FLINK-25185
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Test Infrastructure
Affects Versions: 1.13.3
Reporter: Till Rohrmann


The {{StreamFaultToleranceTestBase}} hangs on AZP.

{code}
2021-12-06T04:24:48.1676089Z 
==
2021-12-06T04:24:48.1678883Z === WARNING: This task took already 95% of the 
available time budget of 237 minutes ===
2021-12-06T04:24:48.1679596Z 
==
2021-12-06T04:24:48.1680326Z 
==
2021-12-06T04:24:48.1680877Z The following Java processes are running (JPS)
2021-12-06T04:24:48.1681467Z 
==
2021-12-06T04:24:48.6514536Z 13701 surefirebooter17740627448580534543.jar
2021-12-06T04:24:48.6515353Z 1622 Jps
2021-12-06T04:24:48.6515795Z 780 Launcher
2021-12-06T04:24:48.6825889Z 
==
2021-12-06T04:24:48.6826565Z Printing stack trace of Java process 13701
2021-12-06T04:24:48.6827012Z 
==
2021-12-06T04:24:49.1876086Z 2021-12-06 04:24:49
2021-12-06T04:24:49.1877098Z Full thread dump OpenJDK 64-Bit Server VM 
(11.0.10+9 mixed mode):
2021-12-06T04:24:49.1877362Z 
2021-12-06T04:24:49.1877672Z Threads class SMR info:
2021-12-06T04:24:49.1878049Z _java_thread_list=0x7f254c007630, length=365, 
elements={
2021-12-06T04:24:49.1878504Z 0x7f2598028000, 0x7f2598280800, 
0x7f2598284800, 0x7f2598299000,
2021-12-06T04:24:49.1878973Z 0x7f259829b000, 0x7f259829d800, 
0x7f259829f800, 0x7f25982a1800,
2021-12-06T04:24:49.1879680Z 0x7f2598337800, 0x7f25983e3000, 
0x7f2598431000, 0x7f2528016000,
2021-12-06T04:24:49.1896613Z 0x7f2599003000, 0x7f259972e000, 
0x7f2599833800, 0x7f259984c000,
2021-12-06T04:24:49.1897558Z 0x7f259984f000, 0x7f2599851000, 
0x7f2599892000, 0x7f2599894800,
2021-12-06T04:24:49.1898075Z 0x7f2499a16000, 0x7f2485acd800, 
0x7f2485ace000, 0x7f24876bb800,
2021-12-06T04:24:49.1898562Z 0x7f2461e59000, 0x7f2499a0e800, 
0x7f2461e5e800, 0x7f2461e81000,
2021-12-06T04:24:49.1899037Z 0x7f24dc015000, 0x7f2461e86800, 
0x7f2448002000, 0x7f24dc01c000,
2021-12-06T04:24:49.1899522Z 0x7f2438001000, 0x7f2438003000, 
0x7f2438005000, 0x7f2438006800,
2021-12-06T04:24:49.1899982Z 0x7f2438008800, 0x7f2434017800, 
0x7f243401a800, 0x7f2414008800,
2021-12-06T04:24:49.1900495Z 0x7f24e8089800, 0x7f24e809, 
0x7f23e4005800, 0x7f24e8092800,
2021-12-06T04:24:49.1901163Z 0x7f24e8099000, 0x7f2414015800, 
0x7f24dc04c000, 0x7f2414018800,
2021-12-06T04:24:49.1901680Z 0x7f241402, 0x7f24dc058000, 
0x7f24dc05b000, 0x7f2414022000,
2021-12-06T04:24:49.1902283Z 0x7f24d400f000, 0x7f241402e800, 
0x7f2414031800, 0x7f2414033800,
2021-12-06T04:24:49.1902880Z 0x7f2414035000, 0x7f2414037000, 
0x7f2414038800, 0x7f241403a800,
2021-12-06T04:24:49.1903354Z 0x7f241403c000, 0x7f241403e000, 
0x7f241403f800, 0x7f2414041800,
2021-12-06T04:24:49.1903812Z 0x7f2414043000, 0x7f2414045000, 
0x7f24dc064800, 0x7f2414047000,
2021-12-06T04:24:49.1904284Z 0x7f2414048800, 0x7f241404a800, 
0x7f241404c800, 0x7f241404e000,
2021-12-06T04:24:49.1904800Z 0x7f241405, 0x7f2414051800, 
0x7f2414053800, 0x7f2414055000,
2021-12-06T04:24:49.1905455Z 0x7f2414057000, 0x7f2414059000, 
0x7f241405a800, 0x7f241405c800,
2021-12-06T04:24:49.1906098Z 0x7f241405e000, 0x7f241406, 
0x7f2414062000, 0x7f2414063800,
2021-12-06T04:24:49.1906728Z 0x7f22e400c800, 0x7f2328008000, 
0x7f2284007000, 0x7f22cc019800,
2021-12-06T04:24:49.1907396Z 0x7f21f8004000, 0x7f2304012800, 
0x7f230001b000, 0x7f223c011000,
2021-12-06T04:24:49.1908080Z 0x7f24e40c1800, 0x7f2454001000, 
0x7f24e40c3000, 0x7f2454003000,
2021-12-06T04:24:49.1908794Z 0x7f24e40c5000, 0x7f2454004800, 
0x7f2444002000, 0x7f2444002800,
2021-12-06T04:24:49.1909522Z 0x7f245808b800, 0x7f24b8032800, 
0x7f24ac021000, 0x7f24b8034800,
2021-12-06T04:24:49.1910280Z 0x7f24b8036800, 0x7f24ac032800, 
0x7f24b8052000, 0x7f24ac033800,
2021-12-06T04:24:49.1911023Z 0x7f24ac035000, 0x7f24b8067000, 
0x7f24ac036000, 0x7f241407d000,

[jira] [Created] (FLINK-25184) Introduce setDeliveryCallback to KafkaWriter

2021-12-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-25184:


 Summary: Introduce setDeliveryCallback to KafkaWriter
 Key: FLINK-25184
 URL: https://issues.apache.org/jira/browse/FLINK-25184
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Kafka
Reporter: Jingsong Lee
 Fix For: 1.15.0


The method can have package-level visibility.

Inside the managed table, we rely on this callback to know the offset of the 
written message and use this offset to do full and incremental alignment.
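As a rough illustration of the callback idea: names such as `setDeliveryCallback` mirror the ticket, but the shapes below are assumptions for the sketch, not the actual KafkaWriter API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback notified with the offset of each acknowledged record.
interface DeliveryCallback {
    void onDelivered(int partition, long offset);
}

// Simplified stand-in for the writer; package-private setter as suggested.
class SketchKafkaWriter {
    private DeliveryCallback callback = (p, o) -> {};
    private long nextOffset = 0;

    void setDeliveryCallback(DeliveryCallback callback) {
        this.callback = callback;
    }

    void write(String record) {
        // In the real writer the offset would come from the broker ack;
        // here we simulate it with a counter.
        callback.onDelivered(0, nextOffset++);
    }
}

public class DeliveryCallbackSketch {
    public static void main(String[] args) {
        SketchKafkaWriter writer = new SketchKafkaWriter();
        List<Long> offsets = new ArrayList<>();
        writer.setDeliveryCallback((p, o) -> offsets.add(o));
        writer.write("a");
        writer.write("b");
        System.out.println("delivered offsets: " + offsets);
    }
}
```

The managed table could then record the delivered offsets and use them as alignment points between full and incremental data, as the ticket describes.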



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25183) Optimize changelog normalize for the managed table upsert mode

2021-12-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-25183:


 Summary: Optimize changelog normalize for the managed table upsert 
mode
 Key: FLINK-25183
 URL: https://issues.apache.org/jira/browse/FLINK-25183
 Project: Flink
  Issue Type: Sub-task
Reporter: Jingsong Lee
 Fix For: 1.15.0


The upsert mode of the managed table preserves complete delete messages and 
avoids normalization for the following downstream operators:
 * Upsert sink: the upsert sink only requires upsert inputs without UPDATE_BEFORE.
 * Join: a join over unique inputs stores records by unique key, so it can work 
without UPDATE_BEFORE.
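A toy illustration of why complete DELETE messages let such an operator keep just the latest record per unique key without UPDATE_BEFORE (types and names are invented for the sketch):

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertStateSketch {
    enum Kind { UPSERT, DELETE }

    public static void main(String[] args) {
        // State keyed by the unique key, holding only the latest value.
        Map<String, String> stateByKey = new HashMap<>();
        // An upsert stream with complete deletes: (kind, key, value).
        String[][] stream = {
            {"UPSERT", "k1", "v1"},
            {"UPSERT", "k1", "v2"}, // overwrite; no UPDATE_BEFORE needed
            {"DELETE", "k1", null},
        };
        for (String[] rec : stream) {
            if (Kind.valueOf(rec[0]) == Kind.DELETE) {
                stateByKey.remove(rec[1]);
            } else {
                stateByKey.put(rec[1], rec[2]);
            }
        }
        System.out.println("state size: " + stateByKey.size());
    }
}
```

Because every delete carries the key, the operator never needs the retracted old value (UPDATE_BEFORE) to keep its per-key state correct.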



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-197: API stability graduation process

2021-12-06 Thread Ingo Bürk
Hi Till,

from my (admittedly limited) experience with how far projects lag behind in
terms of Flink versions – yes, the combined time it would take to mature
then seems reasonable enough for a sufficient adoption, IMO.

Another reason why I think two releases as a default for the last step
makes sense: say you mature an API to PublicEvolving. Typically, there will
be issues found afterwards. Even if you address these in the very next
release cycle, a duration of one release would mean you fully mature the
API in the same release in which things are still being fixed; intuitively,
it makes sense to me that the step to Public would come after a period of
no changes needed, however.


Ingo

On Fri, Dec 3, 2021 at 4:55 PM Till Rohrmann  wrote:

> Hi Ingo, thanks for your feedback.
>
> Do you think that two release cycles per graduation step would be long
> enough or should it be longer?
>
> Cheers,
> Till
>
> On Fri, Dec 3, 2021 at 4:29 PM Ingo Bürk  wrote:
>
> > Hi Till,
> >
> > Overall I whole-heartedly agree with the proposals in this FLIP. Thank
> you
> > for starting this discussion as well! This seems like something that
> could
> > be tested quite nicely with ArchUnit as well; I'll be happy to help
> should
> > the FLIP be accepted.
> >
> > > I would propose per default a single release.
> >
> > The step from PublicEvolving to Public feels more important to me, and I
> > would personally suggest making this transition a bit longer. We have a
> bit
> > of a chicken-egg problem here, because the goal of your FLIP is,
> > ultimately, also to motivate faster adoption of new Flink versions, but
> the
> > status quo prevents that; if we mature APIs too quickly, we risk losing
> out
> > on important feedback. Therefore, I would propose starting slower here,
> and
> > rather think about shortening that cycle in the future.
> >
> >
> > Best
> > Ingo
> >
> > On Thu, Dec 2, 2021 at 3:57 PM Till Rohrmann 
> wrote:
> >
> > > Hi everyone,
> > >
> > > As promised, here is the follow-up FLIP [1] for discussing how we can
> > > ensure that newly introduced APIs are being stabilized over time. This
> > FLIP
> > > is related to FLIP-196 [2].
> > >
> > > The idea of FLIP-197 is to introduce an API graduation process that
> > forces
> > > us to increase the API stability guarantee unless there is a very good
> > > reason not to do so. So the proposal is to reverse the process from
> > opt-in
> > > (increasing the stability guarantee explicitly) to opt-out (deciding
> that
> > > an API cannot be graduated with a good reason).
> > >
> > > Since every process breaks if it is not automated, we propose a richer
> > set
> > > of API stability annotations that can capture enough information so
> that
> > we
> > > can implement a test that fails if we fail to follow the process.
> > >
> > > Looking forward to your feedback.
> > >
> > > Hopefully, we can provide our users a better experience when working
> with
> > > Flink because we offer more stable APIs and make them available faster.
> > >
> > > [1] https://cwiki.apache.org/confluence/x/J5eqCw
> > > [2] https://cwiki.apache.org/confluence/x/IJeqCw
> > >
> > > Cheers,
> > > Till
> > >
> >
>


[jira] [Created] (FLINK-25182) NoClassDefFoundError of PulsarAdminImpl by using flink-connector-pulsar:1.14 on k8s flink cluster

2021-12-06 Thread HeYe (Jira)
HeYe created FLINK-25182:


 Summary: NoClassDefFoundError of PulsarAdminImpl by using 
flink-connector-pulsar:1.14 on k8s flink cluster
 Key: FLINK-25182
 URL: https://issues.apache.org/jira/browse/FLINK-25182
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Pulsar
Affects Versions: 1.14.0
 Environment: Flink: HA cluster on k8s

Flink Mode: session

Version: Flink 1.14.0

Connector:  flink-connector-pulsar_2.11:1.14.0

Pulsar cluster: ( by StreamNative' s helm charts)

   broker version: 2.8.0.8

   bookie version: 2.7.2.8

   pulsar proxy: 2.8.0.8
Reporter: HeYe
 Attachments: image-2021-12-06-16-09-12-816.png, 
image-2021-12-06-16-09-52-042.png, image-2021-12-06-16-10-13-697.png

NoClassDefFoundError of PulsarAdminImpl by using flink-connector-pulsar:1.14 on 
k8s flink cluster

 

Flink: Session mode in HA cluster on k8s

Version: Flink 1.14.0

Connector:  flink-connector-pulsar_2.11:1.14.0

 

The connector works when running from IntelliJ IDEA, but throws an exception on 
the dev k8s cluster; see the screenshots for the exception:

!image-2021-12-06-16-09-12-816.png!

!image-2021-12-06-16-09-52-042.png!

!image-2021-12-06-16-10-13-697.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25181) KafkaSourceITCase.testValueOnlyDeserializer fails on AZP

2021-12-06 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-25181:
-

 Summary: KafkaSourceITCase.testValueOnlyDeserializer fails on AZP
 Key: FLINK-25181
 URL: https://issues.apache.org/jira/browse/FLINK-25181
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.3
Reporter: Till Rohrmann
 Fix For: 1.13.4


The test case {{KafkaSourceITCase.testValueOnlyDeserializer}} fails on AZP with

{code}
Dec 06 00:53:02 [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time 
elapsed: 62.314 s <<< FAILURE! - in 
org.apache.flink.connector.kafka.source.KafkaSourceITCase
Dec 06 00:53:02 [ERROR] 
testValueOnlyDeserializer(org.apache.flink.connector.kafka.source.KafkaSourceITCase)
  Time elapsed: 3.018 s  <<< FAILURE!
Dec 06 00:53:02 java.lang.AssertionError: expected:<660> but was:<456>
Dec 06 00:53:02 at org.junit.Assert.fail(Assert.java:88)
Dec 06 00:53:02 at org.junit.Assert.failNotEquals(Assert.java:834)
Dec 06 00:53:02 at org.junit.Assert.assertEquals(Assert.java:645)
Dec 06 00:53:02 at org.junit.Assert.assertEquals(Assert.java:631)
Dec 06 00:53:02 at 
org.apache.flink.connector.kafka.source.KafkaSourceITCase.testValueOnlyDeserializer(KafkaSourceITCase.java:173)
Dec 06 00:53:02 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Dec 06 00:53:02 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Dec 06 00:53:02 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Dec 06 00:53:02 at java.lang.reflect.Method.invoke(Method.java:498)
Dec 06 00:53:02 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
Dec 06 00:53:02 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Dec 06 00:53:02 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
Dec 06 00:53:02 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
Dec 06 00:53:02 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
Dec 06 00:53:02 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Dec 06 00:53:02 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
Dec 06 00:53:02 at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
Dec 06 00:53:02 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Dec 06 00:53:02 at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
Dec 06 00:53:02 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
Dec 06 00:53:02 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
Dec 06 00:53:02 at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
Dec 06 00:53:02 at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
Dec 06 00:53:02 at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
Dec 06 00:53:02 at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
Dec 06 00:53:02 at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{code}

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27568=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5=6558



--
This message was sent by Atlassian Jira
(v8.20.1#820001)