Re: [VOTE] Creating an Apache Flink slack workspace
+1 (non-binding) Best Regards Peter Huang On Wed, May 18, 2022 at 9:33 PM Leonard Xu wrote: > Thanks Xintong for driving this. > > +1 > > Best, > Leonard > > > 2022年5月19日 上午11:11,Zhou, Brian 写道: > > > > +1 (non-binding) Slack is a better place for code sharing and quick > discussion. > > > > Regards, > > Brian Zhou > > > > -Original Message- > > From: Yun Tang > > Sent: Thursday, May 19, 2022 10:32 AM > > To: dev > > Subject: Re: [VOTE] Creating an Apache Flink slack workspace > > > > > > [EXTERNAL EMAIL] > > > > Thanks Xintong for driving this. +1 from my side. > > > > > > Best > > Yun Tang > > > > From: Zhu Zhu > > Sent: Wednesday, May 18, 2022 17:08 > > To: dev > > Subject: Re: [VOTE] Creating an Apache Flink slack workspace > > > > +1 (binding) > > > > Thanks, > > Zhu > > > > Timo Walther 于2022年5月18日周三 16:52写道: > >> > >> +1 (binding) > >> > >> Thanks, > >> Timo > >> > >> > >> On 17.05.22 20:44, Gyula Fóra wrote: > >>> +1 (binding) > >>> > >>> On Tue, 17 May 2022 at 19:52, Yufei Zhang wrote: > >>> > +1 (nonbinding) > > On Tue, May 17, 2022 at 5:29 PM Márton Balassi < > balassi.mar...@gmail.com> > wrote: > > > +1 (binding) > > > > On Tue, May 17, 2022 at 11:00 AM Jingsong Li > > > wrote: > > > >> Thank Xintong for driving this work. > >> > >> +1 > >> > >> Best, > >> Jingsong > >> > >> On Tue, May 17, 2022 at 4:49 PM Martijn Visser < > martijnvis...@apache.org > >> > >> wrote: > >> > >>> +1 (binding) > >>> > >>> On Tue, 17 May 2022 at 10:38, Yu Li wrote: > >>> > +1 (binding) > > Thanks Xintong for driving this! > > Best Regards, > Yu > > > On Tue, 17 May 2022 at 16:32, Robert Metzger > > >>> wrote: > > > Thanks for starting the VOTE! > > > > +1 (binding) > > > > > > > > On Tue, May 17, 2022 at 10:29 AM Jark Wu > wrote: > > > >> Thank Xintong for driving this work. 
> >> > >> +1 from my side (binding) > >> > >> Best, > >> Jark > >> > >> On Tue, 17 May 2022 at 16:24, Xintong Song < > > tonysong...@gmail.com> > > wrote: > >> > >>> Hi everyone, > >>> > >>> As previously discussed in [1], I would like to open a vote > on > creating > >> an > >>> Apache Flink slack workspace channel. > >>> > >>> The proposed actions include: > >>> - Creating a dedicated slack workspace with the name Apache > > Flink > that > > is > >>> controlled and maintained by the Apache Flink PMC > >>> - Updating the Flink website about rules for using various > > communication > >>> channels > >>> - Setting up an Archive for the Apache Flink slack > >>> - Revisiting this initiative by the end of 2022 > >>> > >>> The vote will last for at least 72 hours, and will be > accepted > >> by a > >>> consensus of active PMC members. > >>> > >>> Best, > >>> > >>> Xintong > >>> > >> > > > > >>> > >> > > > > >>> > >> > >
Re: [VOTE] Creating an Apache Flink slack workspace
Thanks Xintong for driving this. +1

Best,
Leonard
Re: [DISCUSS] Next Flink Kubernetes Operator release timeline
Hi Thomas!

Thank you for raising your concerns. I agree that we should document the compatibility guarantees that we expect to provide going forward.

Since releasing 0.1 (v1alpha1) we have added a great deal of new core features. This obviously required some modification to the CR, but it only touched the status subresource, and the mainly user-facing spec itself had only backward-compatible changes. In the future we would also like to start moving fields out of the status into configmaps to make it easier to change logic later (this can be done in a backward-compatible way).

Based on this, I think it is fair to say that we expect to keep backward compatibility for the CR itself going forward, and I think releasing version 1.0.0 (with API version v1beta1) shows our confidence in the overall spec and design. With the core features covered, I would consider this production ready, and 1.0.0 marks it so. Based on the wider experience we gain from users, we can further improve the design towards the v1 API release (in a backward-compatible way :))

As for the upgrade docs you linked, they explain the process of upgrading from the currently experimental v1alpha1 to the new v1beta1 release. For this release this is the relevant process, but we certainly need to upgrade before the next release. You are also right that the automation is not there; that again is definitely a blocker for the next release to ensure backward compatibility. We already have tickets for these two tasks. [1][2]

Cheers,
Gyula

[1] https://issues.apache.org/jira/browse/FLINK-26955
[2] https://issues.apache.org/jira/browse/FLINK-27302

On Thu, May 19, 2022 at 2:26 AM Thomas Weise wrote: > I think before we release 1.0, we need to define and document the > compatibility guarantees. > > At the moment, the CR changes frequently and as was pointed out > earlier, there isn't any automation to ensure changes are backward > compatible. 
In addition, our documentation still refers to upgrade as > a process that involves removing the prior CRD, which IMO needs to > change for a 1.0 release. > > If we feel that we are not ready to put a compatibility guarantee in > place, then perhaps release the next version as 0.2? > > Thanks, > Thomas > > > [1] > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/upgrade/ > > On Mon, May 16, 2022 at 5:13 PM Aitozi wrote: > > > > Thanks Gyula. It looks good to me. I could do a favor during the release > > also. > > Please feel free to ping me to help the doc, release and test work :) > > > > Best, > > Aitozi > > > > Yang Wang 于2022年5月16日周一 21:57写道: > > > > > Thanks Gyula for sharing the progress. It is very likely we could have > the > > > first release candidate next Monday. > > > > > > Best, > > > Yang > > > > > > Gyula Fóra 于2022年5月16日周一 20:50写道: > > > > > > > Hi Devs! > > > > > > > > We are on track for our planned 1.0.0 release timeline. There are no > > > > outstanding blocker issues on JIRA for the release. > > > > > > > > There are 3 outstanding new feature PRs. They are all in pretty good > > > shape > > > > and should be merged within a day: > > > > https://github.com/apache/flink-kubernetes-operator/pull/213 > > > > https://github.com/apache/flink-kubernetes-operator/pull/216 > > > > https://github.com/apache/flink-kubernetes-operator/pull/217 > > > > > > > > As we agreed previously we should not merge any more new features for > > > > 1.0.0 and focus our efforts on testing, bug fixes and documentation > for > > > > this week. > > > > > > > > I will cut the release branch tomorrow once these PRs are merged. > And the > > > > target day for the first release candidate is next Monday. > > > > > > > > The release managers for this release will be Yang Wang and myself. 
> > > > > > Cheers, > > > > Gyula > > > > > > > > On Wed, Apr 27, 2022 at 11:28 AM Yang Wang > > > wrote: > > > > > > > >> Thanks @Chesnay Schepler for pointing out > this. > > > >> > > > >> The only public interface the flink-kubernetes-operator provides is > the > > > >> CRD[1]. We are trying to stabilize the CRD from v1beta1. > > > >> If more fields are introduced to support new features (e.g. > standalone > > > >> mode, > > > >> SQL jobs), they should have the default value to ensure > compatibility. > > > >> Currently, we do not have any tools to enforce the compatibility > > > >> guarantees. But we have created a ticket[1] to follow this and hope > it > > > >> could be resolved before releasing 1.0.0. > > > >> > > > >> Just as you said, now is also a good time to think more about the > > > approach > > > >> of releases. Since flink-kubernetes-operator is much simpler than > Flink, > > > >> we > > > >> could have a shorter release cycle. > > > >> Two months for a major release (1.0, 1.1, etc.) is reasonable to me. > And > > > >> this > > > >> could be shortened for the minor releases. Also we need to support at > > > least > > > >> the last two major versions. > > > >> > > > >>
[jira] [Created] (FLINK-27690) Add Pulsar Source connector document
LuNng Wang created FLINK-27690:
----------------------------------

             Summary: Add Pulsar Source connector document
                 Key: FLINK-27690
                 URL: https://issues.apache.org/jira/browse/FLINK-27690
             Project: Flink
          Issue Type: Improvement
          Components: API / Python
    Affects Versions: 1.15.0
            Reporter: LuNng Wang


--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Created] (FLINK-27689) Pulsar Connector support PulsarSchema
LuNng Wang created FLINK-27689:
----------------------------------

             Summary: Pulsar Connector support PulsarSchema
                 Key: FLINK-27689
                 URL: https://issues.apache.org/jira/browse/FLINK-27689
             Project: Flink
          Issue Type: Improvement
          Components: API / Python
    Affects Versions: 1.15.0
            Reporter: LuNng Wang

Currently, the Python Pulsar Connector only supports Flink Schema; we also need to support Pulsar Schema. The following link has the details:
https://github.com/apache/flink/pull/19682#discussion_r872131355

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
RE: [VOTE] Creating an Apache Flink slack workspace
+1 (non-binding) Slack is a better place for code sharing and quick discussion.

Regards,
Brian Zhou
Re: [VOTE] FLIP-229: Introduces Join Hint for Flink SQL Batch Job
Thanks for driving, +1 (binding) Best Yun Tang From: Jark Wu Sent: Wednesday, May 18, 2022 23:09 To: dev Subject: Re: [VOTE] FLIP-229: Introduces Join Hint for Flink SQL Batch Job +1(binding) Best, Jark On Wed, 18 May 2022 at 14:18, Jingsong Li wrote: > +1 Thanks for driving. > > Best, > Jingsong > > On Wed, May 18, 2022 at 1:33 PM godfrey he wrote: > > > Thanks Xuyang for driving this, +1(binding) > > > > Best, > > Godfrey > > > > Xuyang 于2022年5月17日周二 10:21写道: > > > > > > Hi, everyone. > > > Thanks for your feedback for FLIP-229: Introduces Join Hint for Flink > > SQL Batch Job[1] on the discussion thread[2]. > > > I'd like to start a vote for it. The vote will be open for at least 72 > > hours unless there is an objection or not enough votes. > > > > > > -- > > > > > > Best! > > > Xuyang > > > > > > > > > [1] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job > > > [2] https://lists.apache.org/thread/y668bxyjz66ggtjypfz9t571m0tyvv9h > > >
Re: [VOTE] Creating an Apache Flink slack workspace
Thanks Xintong for driving this. +1 from my side.

Best
Yun Tang
Re: [ANNOUNCE] Call for Presentations for ApacheCon Asia 2022 streaming track
Unsubscribe

At 2022-05-18 19:47:47, "Yu Li" wrote: >Hi everyone, > >ApacheCon Asia [1] will feature the Streaming track for the second year. >Please don't hesitate to submit your proposal if there is an interesting >project or Flink experience you would like to share with us! > >The conference will be online (virtual) and the talks will be pre-recorded. >The deadline of proposal submission is at the end of this month (May 31st). > >See you all there :) > >Best Regards, >Yu > >[1] https://apachecon.com/acasia2022/cfp.html
Re: [DISCUSS] Next Flink Kubernetes Operator release timeline
I think before we release 1.0, we need to define and document the compatibility guarantees. At the moment, the CR changes frequently and as was pointed out earlier, there isn't any automation to ensure changes are backward compatible. In addition, our documentation still refers to upgrade as a process that involves removing the prior CRD, which IMO needs to change for a 1.0 release. If we feel that we are not ready to put a compatibility guarantee in place, then perhaps release the next version as 0.2? Thanks, Thomas [1] https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/upgrade/ On Mon, May 16, 2022 at 5:13 PM Aitozi wrote: > > Thanks Gyula. It looks good to me. I could do a favor during the release > also. > Please feel free to ping me to help the doc, release and test work :) > > Best, > Aitozi > > Yang Wang 于2022年5月16日周一 21:57写道: > > > Thanks Gyula for sharing the progress. It is very likely we could have the > > first release candidate next Monday. > > > > Best, > > Yang > > > > Gyula Fóra 于2022年5月16日周一 20:50写道: > > > > > Hi Devs! > > > > > > We are on track for our planned 1.0.0 release timeline. There are no > > > outstanding blocker issues on JIRA for the release. > > > > > > There are 3 outstanding new feature PRs. They are all in pretty good > > shape > > > and should be merged within a day: > > > https://github.com/apache/flink-kubernetes-operator/pull/213 > > > https://github.com/apache/flink-kubernetes-operator/pull/216 > > > https://github.com/apache/flink-kubernetes-operator/pull/217 > > > > > > As we agreed previously we should not merge any more new features for > > > 1.0.0 and focus our efforts on testing, bug fixes and documentation for > > > this week. > > > > > > I will cut the release branch tomorrow once these PRs are merged. And the > > > target day for the first release candidate is next Monday. > > > > > > The release managers for this release will be Yang Wang and myself. 
> > > > > > Cheers, > > > Gyula > > > > > > On Wed, Apr 27, 2022 at 11:28 AM Yang Wang > > wrote: > > > > > >> Thanks @Chesnay Schepler for pointing out this. > > >> > > >> The only public interface the flink-kubernetes-operator provides is the > > >> CRD[1]. We are trying to stabilize the CRD from v1beta1. > > >> If more fields are introduced to support new features (e.g. standalone > > >> mode, > > >> SQL jobs), they should have the default value to ensure compatibility. > > >> Currently, we do not have any tools to enforce the compatibility > > >> guarantees. But we have created a ticket[2] to follow this and hope it > > >> could be resolved before releasing 1.0.0. > > >> > > >> Just as you said, now is also a good time to think more about the > > approach > > >> of releases. Since flink-kubernetes-operator is much simpler than Flink, > > >> we > > >> could have a shorter release cycle. > > >> Two months for a major release (1.0, 1.1, etc.) is reasonable to me. And > > >> this > > >> could be shortened for the minor releases. Also we need to support at > > least > > >> the last two major versions. > > >> > > >> Maybe the standalone mode support is a big enough feature for version > > 2.0. > > >> > > >> > > >> [1]. > > >> > > >> > > https://github.com/apache/flink-kubernetes-operator/tree/main/helm/flink-kubernetes-operator/crds > > >> [2]. https://issues.apache.org/jira/browse/FLINK-26955 > > >> > > >> > > >> @Hao t Chang We do not have regular sync up > > meeting > > >> so > > >> far. But I think we could schedule some sync up for the 1.0.0 release if > > >> necessary. Anyone who is interested is welcome. > > >> > > >> > > >> Best, > > >> Yang > > >> > > >> > > >> > > >> Hao t Chang wrote on Wed, Apr 27, 2022 at 07:45: > > >> > > >> > Hi Gyula, > > >> > > > >> > Thanks for the release timeline information. I would like to learn the > > >> > gathered knowledge and volunteer as well. Will there be a sync-up > > >> > meeting/call for this collaboration? 
> > >> > > > >> > From: Gyula Fóra > > >> > Date: Monday, April 25, 2022 at 11:22 AM > > >> > To: dev > > >> > Subject: [DISCUSS] Next Flink Kubernetes Operator release timeline > > >> > Hi Devs! > > >> > > > >> > The community has been working hard on cleaning up the operator logic > > >> and > > >> > adding some core features that have been missing from the preview > > >> release > > >> > (session jobs for example). We have also added some significant > > >> > improvements around deployment/operations. > > >> > > > >> > With the current pace of the development I think in a few weeks we > > >> should > > >> > be in a good position to release next version of the operator. This > > >> would > > >> > also give us the opportunity to add support for the upcoming 1.15 > > >> release > > >> > :) > > >> > > > >> > We have to decide on 2 main things: > > >> > 1. Target release date > > >> > 2. Release version > > >> > > > >> > With the current state of the project I am confident that we could > > cut a > > >> > really good
[jira] [Created] (FLINK-27688) Pluggable backend for EventUtils
Márton Balassi created FLINK-27688:
----------------------------------

             Summary: Pluggable backend for EventUtils
                 Key: FLINK-27688
                 URL: https://issues.apache.org/jira/browse/FLINK-27688
             Project: Flink
          Issue Type: New Feature
          Components: Kubernetes Operator
            Reporter: Márton Balassi
            Assignee: Gyula Fora
             Fix For: kubernetes-operator-1.1.0

Currently the [EventUtils|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/EventUtils.java] utility that we use to publish events for the operator has an implementation that is tightly coupled with the [Kubernetes Events|https://www.containiq.com/post/kubernetes-events] mechanism. I suggest enhancing this with a pluggable event interface, which could be implemented by our users to support their event messaging system of choice.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
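A pluggable backend along the lines this ticket suggests might look like the sketch below. The interface name, method signature, and the in-memory backend are illustrative assumptions for discussion, not the actual FLINK-27688 design:

```java
import java.util.ArrayList;
import java.util.List;

public class EventBackendSketch {

    // Hypothetical plug-in point: users implement this to route operator
    // events to their messaging system of choice. Name and signature are
    // assumptions for illustration, not the real API.
    interface FlinkEventBackend {
        void publish(String reason, String message);
    }

    // Trivial backend that just collects events in memory.
    static class InMemoryBackend implements FlinkEventBackend {
        final List<String> published = new ArrayList<>();

        @Override
        public void publish(String reason, String message) {
            published.add(reason + ": " + message);
        }
    }

    public static void main(String[] args) {
        InMemoryBackend backend = new InMemoryBackend();
        backend.publish("JobStatusChanged", "Job went to RUNNING");
        System.out.println(backend.published.get(0));
    }
}
```

A Kubernetes-backed implementation would then wrap the existing EventUtils logic behind the same interface.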
RE: Kafka Sink Key and Value Avro Schema Usage Issues
Also, I forgot to attach the information regarding how I converted the Avro objects to bytes in the approach I took with the deprecated Kafka producer:

protected byte[] getValueBytes(Value value) {
    DatumWriter<Value> valWriter = new SpecificDatumWriter<>(value.getSchema());
    ByteArrayOutputStream valOut = new ByteArrayOutputStream();
    BinaryEncoder valEncoder = EncoderFactory.get().binaryEncoder(valOut, null);
    try {
        valWriter.write(value, valEncoder);
        valEncoder.flush();
        valOut.close();
    } catch (Exception e) {
        // TODO Auto-generated catch block
    }
    return valOut.toByteArray();
}

protected byte[] getKeyBytes(Key key) {
    DatumWriter<Key> keyWriter = new SpecificDatumWriter<>(key.getSchema());
    ByteArrayOutputStream keyOut = new ByteArrayOutputStream();
    BinaryEncoder keyEncoder = EncoderFactory.get().binaryEncoder(keyOut, null);
    try {
        keyWriter.write(key, keyEncoder);
        keyEncoder.flush();
        keyOut.close();
    } catch (Exception e) {
        // TODO Auto-generated catch block
    }
    return keyOut.toByteArray();
}

From: Ghiya, Jay (GE Healthcare)
Sent: 18 May 2022 21:51
To: u...@flink.apache.org
Cc: dev@flink.apache.org; Pandiaraj, Satheesh kumar (GE Healthcare) ; Kumar, Vipin (GE Healthcare)
Subject: Kafka Sink Key and Value Avro Schema Usage Issues

Hi Team,

This is regarding the Flink Kafka sink. We have a use case where we have headers and both key and value as Avro schemas. 
Below is the expectation in terms of intuitiveness for the Avro Kafka key and value:

KafkaSink.>builder()
        .setBootstrapServers(cloudkafkaBrokerAPI)
        .setRecordSerializer(
                KafkaRecordSerializationSchema.builder()
                        .setKeySerializationSchema(
                                ConfluentRegistryAvroSerializationSchema.forSpecific(
                                        Key.class, "Key", cloudSchemaRegistryURL))
                        .setValueSerializationSchema(
                                ConfluentRegistryAvroSerializationSchema.forSpecific(
                                        Value.class, "val", cloudSchemaRegistryURL))
                        .setTopic(outputTopic)
                        .build())
        .build();

What I understood is that it currently does not accept both key and value as Avro schemas as part of the Kafka sink. It only accepts sink.

First I tried to use the deprecated FlinkKafkaProducer by implementing KafkaSerializationSchema and supplying the Avro serializer properties via:

cloudKafkaProperties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class.getName());
cloudKafkaProperties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class.getName());

The problem here is that I am able to run this example, but the schema that gets stored in the Confluent schema registry is:

{
  "subject": "ddp_out-key",
  "version": 1,
  "id": 1,
  "schema": "\"bytes\""
}

Instead of the full schema, it has just recognized the whole payload as bytes.

So I am looking for a solution without the Kafka sink to make it work for now. Also, is there a feature request on the roadmap for adding ProducerRecord support to KafkaSink itself? That would be ideal: the previous operator could send the ProducerRecord with key, value and headers, and it could then be forwarded on.

-Jay
GEHC
[jira] [Created] (FLINK-27687) SpanningWrapper shouldn't assume temp folder exists
Gaël Renoux created FLINK-27687:
----------------------------------

             Summary: SpanningWrapper shouldn't assume temp folder exists
                 Key: FLINK-27687
                 URL: https://issues.apache.org/jira/browse/FLINK-27687
             Project: Flink
          Issue Type: New Feature
          Components: Runtime / Network
    Affects Versions: 1.14.4
            Reporter: Gaël Renoux

SpanningWrapper.createSpillingChannel assumes that the folder in which we create the file exists. However, this is not the case in the following scenario (which actually happened to us today):
 * The temp folders were created a while ago (I assume on startup of the task manager) in the /tmp folder. They weren't used for a while, probably because we didn't have any record big enough to trigger it.
 * The cleanup cron for /tmp did its job and deleted those old folders in /tmp.
 * We deployed a new version of the job that actually needed the folders, and it crashed.

=> I'm not sure whether it should be SpanningWrapper's responsibility to create the folder if it doesn't exist anymore; I'm not familiar enough with Flink's internals to guess which class should do it. The problem occurred for us in SpanningWrapper, but it can probably happen in other places as well.

More generally, assuming that folders and files in /tmp won't get deleted at some point doesn't seem correct to me. The [documentation for io.tmp.dirs|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/] recommends that it shouldn't be purged, but we do need to clean up at some point. If that is not the case, then the documentation should be updated to indicate that this is not a recommendation but mandatory, and that purges will break the jobs (not just trigger a recovery).

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
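A defensive fix in the spirit of this report could recreate the directory before opening the spill file. The helper below is an illustrative sketch, not Flink's actual SpanningWrapper code, and it assumes recreating the directory is an acceptable remedy:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class SpillDirSketch {

    // Sketch of a defensive createSpillingChannel-style helper: instead of
    // assuming the temp directory still exists, recreate it if an external
    // /tmp cleanup removed it. All names here are illustrative.
    static File createSpillFile(File spillDir, String fileName) throws IOException {
        if (!spillDir.isDirectory() && !spillDir.mkdirs()) {
            throw new IOException("Cannot (re)create spill directory " + spillDir);
        }
        File spillFile = new File(spillDir, fileName);
        // Real spilling code would open a FileChannel here; creating the
        // file is enough to demonstrate the point.
        Files.createFile(spillFile.toPath());
        return spillFile;
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("spill-demo").toFile();
        // Never created on disk: simulates a folder purged by a /tmp cron.
        File spillDir = new File(base, "flink-io");
        createSpillFile(spillDir, "spill-0.tmp");
        System.out.println(spillDir.isDirectory()); // the directory exists again
    }
}
```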
Kafka Sink Key and Value Avro Schema Usage Issues
Hi Team,

This is regarding the Flink Kafka sink. We have a use case where we have headers and both key and value as Avro schemas. Below is the expectation in terms of intuitiveness for the Avro Kafka key and value:

KafkaSink.>builder()
        .setBootstrapServers(cloudkafkaBrokerAPI)
        .setRecordSerializer(
                KafkaRecordSerializationSchema.builder()
                        .setKeySerializationSchema(
                                ConfluentRegistryAvroSerializationSchema.forSpecific(
                                        Key.class, "Key", cloudSchemaRegistryURL))
                        .setValueSerializationSchema(
                                ConfluentRegistryAvroSerializationSchema.forSpecific(
                                        Value.class, "val", cloudSchemaRegistryURL))
                        .setTopic(outputTopic)
                        .build())
        .build();

What I understood is that it currently does not accept both key and value as Avro schemas as part of the Kafka sink. It only accepts sink.

First I tried to use the deprecated FlinkKafkaProducer by implementing KafkaSerializationSchema and supplying the Avro serializer properties via:

cloudKafkaProperties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class.getName());
cloudKafkaProperties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class.getName());

The problem here is that I am able to run this example, but the schema that gets stored in the Confluent schema registry is:

{
  "subject": "ddp_out-key",
  "version": 1,
  "id": 1,
  "schema": "\"bytes\""
}

Instead of the full schema, it has just recognized the whole payload as bytes.

So I am looking for a solution without the Kafka sink to make it work for now. Also, is there a feature request on the roadmap for adding ProducerRecord support to KafkaSink itself? That would be ideal: the previous operator could send the ProducerRecord with key, value and headers, and it could then be forwarded on.

-Jay
GEHC
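The composition being asked for here (independent key and value serializers combined into one produced record) can be sketched with plain JDK types. The interface below only mirrors the shape of a serialization schema; all names are illustrative assumptions, not Flink or Kafka client APIs:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyValueRecordSketch {

    // Shape-alike of a serialization schema, for illustration only.
    interface Serializer<T> {
        byte[] serialize(T element);
    }

    // A produced record carries independently serialized key and value,
    // which is what a ProducerRecord-accepting sink would forward to Kafka.
    static final class KafkaRecord {
        final byte[] key;
        final byte[] value;
        KafkaRecord(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    // Combine a key serializer and a value serializer into one record,
    // analogous to configuring key and value schemas on a record serializer.
    static <K, V> KafkaRecord toRecord(K key, V value,
                                       Serializer<K> keySer, Serializer<V> valSer) {
        return new KafkaRecord(keySer.serialize(key), valSer.serialize(value));
    }

    public static void main(String[] args) {
        Serializer<String> utf8 = s -> s.getBytes(StandardCharsets.UTF_8);
        KafkaRecord r = toRecord("k1", "v1", utf8, utf8);
        System.out.println(Arrays.equals(r.key, "k1".getBytes(StandardCharsets.UTF_8)));
    }
}
```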
[jira] [Created] (FLINK-27686) Only patch status when the status actually changed
Gyula Fora created FLINK-27686:
----------------------------------

             Summary: Only patch status when the status actually changed
                 Key: FLINK-27686
                 URL: https://issues.apache.org/jira/browse/FLINK-27686
             Project: Flink
          Issue Type: Improvement
          Components: Kubernetes Operator
            Reporter: Gyula Fora
             Fix For: kubernetes-operator-1.0.0

The StatusHelper class currently always patches the status, regardless of whether it changed or not. We should use an ObjectMapper, simply compare the ObjectNode representations, and only patch if there is any change.

(I think we cannot directly compare the status objects because some of the content comes from getters and is not part of the equals implementation.)

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
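The proposed check can be sketched with JDK types only. The real implementation would compare Jackson ObjectNode representations of the status rather than strings, and the class and method names here are illustrative:

```java
import java.util.Objects;

public class StatusPatchSketch {

    // Last status representation we successfully patched, plus a counter
    // standing in for actual Kubernetes PATCH calls.
    private String lastPatched = null;
    int patchCount = 0;

    // Patch only when the serialized status differs from the last one we
    // sent. FLINK-27686 proposes comparing ObjectNode representations,
    // since the status POJOs' equals() may not cover getter-derived fields.
    void patchStatusIfChanged(String statusJson) {
        if (Objects.equals(statusJson, lastPatched)) {
            return; // no change -> skip the API call
        }
        patchCount++;
        lastPatched = statusJson;
    }

    public static void main(String[] args) {
        StatusPatchSketch helper = new StatusPatchSketch();
        helper.patchStatusIfChanged("{\"state\":\"RUNNING\"}");
        helper.patchStatusIfChanged("{\"state\":\"RUNNING\"}"); // skipped
        helper.patchStatusIfChanged("{\"state\":\"FAILED\"}");
        System.out.println(helper.patchCount); // 2
    }
}
```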
Re: [VOTE] FLIP-229: Introduces Join Hint for Flink SQL Batch Job
+1(binding) Best, Jark On Wed, 18 May 2022 at 14:18, Jingsong Li wrote: > +1 Thanks for driving. > > Best, > Jingsong > > On Wed, May 18, 2022 at 1:33 PM godfrey he wrote: > > > Thanks Xuyang for driving this, +1(binding) > > > > Best, > > Godfrey > > > > Xuyang 于2022年5月17日周二 10:21写道: > > > > > > Hi, everyone. > > > Thanks for your feedback for FLIP-229: Introduces Join Hint for Flink > > SQL Batch Job[1] on the discussion thread[2]. > > > I'd like to start a vote for it. The vote will be open for at least 72 > > hours unless there is an objection or not enough votes. > > > > > > -- > > > > > > Best! > > > Xuyang > > > > > > > > > [1] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job > > > [2] https://lists.apache.org/thread/y668bxyjz66ggtjypfz9t571m0tyvv9h > > >
Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
Hi Godfrey,

Regarding the Table API for CTAS, "Table#createTableAs(tablePath)" seems a little strange to me. Usually, the parameter after AS should be the query, but here the query is in front of AS. I slightly prefer a method on TableEnvironment besides "createTable" (i.e. a special createTable that also writes data). For example:

void createTableAs(String path, TableDescriptor descriptor, Table query);

Usage:

tableEnv.createTableAs(
    "T1",
    TableDescriptor.forConnector("hive")
        .option("format", "parquet")
        .build(),
    query);

Best,
Jark

On Wed, 18 May 2022 at 22:53, Jark Wu wrote: > Hi Mang, > > Thanks for proposing this, CTAS is a very important API for batch users. > > I think the key problem of this FLIP is the ACID semantics of the CTAS > operation. > We care most about two parts of the semantics: > 1) Atomicity: the created table should be rolled back if the write is > failed. > 2) Isolation: the created table shouldn't be visible before the write is > successful (read uncommitted). > > From your investigation, it seems that: > - Flink (your FLIP): none of them. ==> LEVEL-1 > - Spark DataSource v1: is atomic (can roll back), but is not isolated. ==> > LEVEL-2 > - Spark DataSource v2: guarantees both of them. ==> LEVEL-3 > - Hive MR: guarantees both of them. ==> LEVEL-3 > > In order to support higher ACID semantics, I agree with Godfrey that we > need some hooks in JM > which can be called when the job is finished or failed/canceled. It might > look like > `StreamExecutionEnvironment#registerJobListener(JobListener)`, > but JobListener is called on the > client side. What we need is an interface called on the JM side, because > the job can be submitted in > detached mode. > > With this interface, we can easily support LEVEL-2 semantics by calling > `Catalog#dropTable` in the > `JobListener#onJobFailed`. 
We can also support LEVEL-3 by introducing > `StagingTableCatalog` like Spark, > calling `StagedTable#commitStagedChanges()` in `JobListener#onJobFinished` > and > calling StagedTable#abortStagedChanges() in `JobListener#onJobFailed`. > > Best, > Jark > > > On Wed, 18 May 2022 at 12:29, godfrey he wrote: > >> Hi Mang, >> >> Thanks for driving this FLIP. >> >> Please follow the FLIP template[1] style, and the `Syntax ` is part of >> the `Public API Changes` section. >> ‘Program research’ and 'Implementation Plan' are part of the `Proposed >> Changes` section, >> or move ‘Program research’ to the appendix. >> >> > Providing methods that are used to execute CTAS for Table API users. >> We should introduce `createTable` in `Table` instead of >> `TableEnvironment`. >> Because all table operations are defined in `Table`, see: >> Table#executeInsert, >> Table#insertInto, etc. >> About the method name, I prefer to use `createTableAs`. >> >> > TableSink needs to provide the CleanUp API, developers implement as >> needed. >> I think it's hard for TableSink to implement a clean up operation. For >> file system sink, >> the data can be written to a temporary directory, but for key/value >> sinks, it's hard to >> remove the written keys, unless the sink records all written keys. >> >> > Do not do drop table operations in the framework, drop table is >> implemented in >> TableSink according to the needs of specific TableSink >> The TM process may crash at any time, and the drop operation will not >> be executed any more. >> >> How about we do the drop table operation and cleanup data action in the >> catalog? >> Where to execute the drop operation. one approach is in client, other is >> in JM. >> 1. in client: this requires the client to be alive until the job is >> finished and failed. >> 2. in JM: this requires the JM could provide some interfaces/hooks >> that the planner >> implements the logic and the code will be executed in JM. 
>> I prefer the approach two, but it requires more detail design with >> runtime @gaoyunhaii, @kevin.yingjie >> >> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template >> >> Best, >> Godfrey >> >> >> Mang Zhang 于2022年5月6日周五 11:24写道: >> >> > >> > Hi, Yuxia >> > Thanks for your reply! >> > About the question 1, we will not support, FLIP-218[1] is to simplify >> the complexity of user DDL and make it easier for users to use. I have >> never encountered this case in a big data. >> > About the question 2, we will provide a public API like below public >> void cleanUp(); >> > >> > Regarding the mechanism of cleanUp, people who are familiar with >> the runtime module need to provide professional advice, which is what we >> need to focus on. >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > -- >> > >> > Best regards, >> > Mang Zhang >> > >> > >> > >> > >> > >> > At 2022-04-29 17:00:03, "yuxia" wrote: >> > >Thanks for for driving this work, it's to be a useful feature. >> > >About the flip-218, I have some questions. >> > > >> > >1: Does our CTAS syntax support specify
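To make the staged-commit idea discussed above concrete, here is a minimal, runnable Java sketch. All names in it (`StagedTable`, `JobStatusHook`, `InMemoryStagingCatalog`) are illustrative assumptions modeled on the thread and on Spark's DataSource v2 — this is not Flink's actual API.

```java
// Illustrative sketch only: these types mirror the hooks discussed in the
// thread (a Spark-style StagedTable plus a JM-side job status hook). They
// are assumptions, NOT Flink's real API.
import java.util.ArrayList;
import java.util.List;

public class CtasAtomicitySketch {

    /** A table whose creation is staged until the write job finishes. */
    interface StagedTable {
        void commitStagedChanges(); // make table + data visible (isolation)
        void abortStagedChanges();  // roll back table + data (atomicity)
    }

    /** Hypothetical hook running on the JM side (JobListener runs on the client). */
    interface JobStatusHook {
        void onJobFinished();
        void onJobFailed();
    }

    /** Tiny in-memory stand-in for a StagingTableCatalog. */
    static class InMemoryStagingCatalog {
        final List<String> visibleTables = new ArrayList<>();

        StagedTable stageCreateTable(String name) {
            return new StagedTable() {
                @Override
                public void commitStagedChanges() {
                    visibleTables.add(name); // table becomes visible only now
                }

                @Override
                public void abortStagedChanges() {
                    // staged metadata/files would be deleted here
                }
            };
        }
    }

    /** Bind a staged table to the job lifecycle. */
    static JobStatusHook hookFor(StagedTable staged) {
        return new JobStatusHook() {
            @Override
            public void onJobFinished() {
                staged.commitStagedChanges();
            }

            @Override
            public void onJobFailed() {
                staged.abortStagedChanges();
            }
        };
    }

    static List<String> demo() {
        InMemoryStagingCatalog catalog = new InMemoryStagingCatalog();

        // Successful CTAS job: the table becomes visible on commit.
        hookFor(catalog.stageCreateTable("t_ok")).onJobFinished();

        // Failed CTAS job: the staged table is rolled back, never visible.
        hookFor(catalog.stageCreateTable("t_failed")).onJobFailed();

        return catalog.visibleTables;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [t_ok]
    }
}
```

With such hooks, the LEVEL-2 variant would simply call `Catalog#dropTable` in `onJobFailed`, while LEVEL-3 stages both metadata and data as sketched here.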
Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
Hi Mang, Thanks for proposing this, CTAS is a very important API for batch users. I think the key problem of this FLIP is the ACID semantics of the CTAS operation. We care most about two parts of the semantics: 1) Atomicity: the created table should be rolled back if the write is failed. 2) Isolation: the created table shouldn't be visible before the write is successful (read uncommitted). From your investigation, it seems that: - Flink (your FLIP): none of them. ==> LEVEL-1 - Spark DataSource v1: is atomic (can roll back), but is not isolated. ==> LEVEL-2 - Spark DataSource v2: guarantees both of them. ==> LEVEL-3 - Hive MR: guarantees both of them. ==> LEVEL-3 In order to support higher ACID semantics, I agree with Godfrey that we need some hooks in JM which can be called when the job is finished or failed/canceled. It might look like `StreamExecutionEnvironment#registerJobListener(JobListener)`, but JobListener is called on the client side. What we need is an interface called on the JM side, because the job can be submitted in detached mode. With this interface, we can easily support LEVEL-2 semantics by calling `Catalog#dropTable` in the `JobListener#onJobFailed`. We can also support LEVEL-3 by introducing `StagingTableCatalog` like Spark, calling `StagedTable#commitStagedChanges()` in `JobListener#onJobFinished` and calling StagedTable#abortStagedChanges() in `JobListener#onJobFailed`. Best, Jark On Wed, 18 May 2022 at 12:29, godfrey he wrote: > Hi Mang, > > Thanks for driving this FLIP. > > Please follow the FLIP template[1] style, and the `Syntax ` is part of > the `Public API Changes` section. > ‘Program research’ and 'Implementation Plan' are part of the `Proposed > Changes` section, > or move ‘Program research’ to the appendix. > > > Providing methods that are used to execute CTAS for Table API users. > We should introduce `createTable` in `Table` instead of `TableEnvironment`.
> Because all table operations are defined in `Table`, see: > Table#executeInsert, > Table#insertInto, etc. > About the method name, I prefer to use `createTableAs`. > > > TableSink needs to provide the CleanUp API, developers implement as > needed. > I think it's hard for TableSink to implement a clean up operation. For > file system sink, > the data can be written to a temporary directory, but for key/value > sinks, it's hard to > remove the written keys, unless the sink records all written keys. > > > Do not do drop table operations in the framework, drop table is > implemented in > TableSink according to the needs of specific TableSink > The TM process may crash at any time, and the drop operation will not > be executed any more. > > How about we do the drop table operation and cleanup data action in the > catalog? > Where to execute the drop operation. one approach is in client, other is > in JM. > 1. in client: this requires the client to be alive until the job is > finished and failed. > 2. in JM: this requires the JM could provide some interfaces/hooks > that the planner > implements the logic and the code will be executed in JM. > I prefer the approach two, but it requires more detail design with > runtime @gaoyunhaii, @kevin.yingjie > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template > > Best, > Godfrey > > > Mang Zhang 于2022年5月6日周五 11:24写道: > > > > > Hi, Yuxia > > Thanks for your reply! > > About the question 1, we will not support, FLIP-218[1] is to simplify > the complexity of user DDL and make it easier for users to use. I have > never encountered this case in a big data. > > About the question 2, we will provide a public API like below public > void cleanUp(); > > > > Regarding the mechanism of cleanUp, people who are familiar with > the runtime module need to provide professional advice, which is what we > need to focus on. 
> > > > > > > > > > > > > > > > > > > > > > -- > > > > Best regards, > > Mang Zhang > > > > > > > > > > > > At 2022-04-29 17:00:03, "yuxia" wrote: > > >Thanks for driving this work, it's going to be a useful feature. > > >About the flip-218, I have some questions. > > > > > >1: Does our CTAS syntax support specifying the target table's schema, including column names and data types? I think it may be a useful feature in case we want to change the data types in the target table instead of always copying the source table's schema. It'll be more flexible with this feature. > > >Btw, MySQL's "CREATE TABLE ... SELECT Statement"[1] supports this feature. > > > > > >2: Seems it'll require the sink to implement a public interface to drop the table, so what will the interface look like? > > > > > >[1] https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html > > > > > >Best regards, > > >Yuxia > > > > > >- Original Message - > > >From: "Mang Zhang" > > >To: "dev" > > >Sent: Thursday, April 28, 2022 4:57:24 PM > > >Subject: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS) > > > > > >Hi, everyone > > > > > > > > >I would like to open a discussion for support
[jira] [Created] (FLINK-27685) Add scale subresource
Gyula Fora created FLINK-27685: -- Summary: Add scale subresource Key: FLINK-27685 URL: https://issues.apache.org/jira/browse/FLINK-27685 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Reporter: Gyula Fora Assignee: Gyula Fora Fix For: kubernetes-operator-1.1.0 We should define a scale subresource for the deployment/sessionjob resources that allows us to use the `scale` command or even hook in the HPA. I suggest to use parallelism as the "replicas". -- This message was sent by Atlassian Jira (v8.20.7#820007)
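For reference, enabling a scale subresource on a CRD looks roughly like the fragment below (Kubernetes `apiextensions.k8s.io/v1` schema). The JSON paths are illustrative assumptions that treat job parallelism as the replica count, as proposed; the operator's actual field names may differ.

```yaml
# Sketch of a CustomResourceDefinition fragment enabling the scale subresource.
# The spec/status paths are assumptions for illustration only.
subresources:
  status: {}
  scale:
    # `kubectl scale --replicas=N` (or the HPA) writes here:
    specReplicasPath: .spec.job.parallelism
    # the currently observed value is read from here:
    statusReplicasPath: .status.jobStatus.parallelism
```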
[jira] [Created] (FLINK-27684) FlinkKafkaConsumerBase could record partitions offset when GROUP_OFFSETS
SilkyAlex created FLINK-27684: - Summary: FlinkKafkaConsumerBase could record partitions offset when GROUP_OFFSETS Key: FLINK-27684 URL: https://issues.apache.org/jira/browse/FLINK-27684 Project: Flink Issue Type: Improvement Components: Connectors / Kafka Affects Versions: 1.14.3 Reporter: SilkyAlex When FlinkKafkaConsumerBase's startupMode is set to EARLIEST/LATEST/TIMESTAMP/GROUP_OFFSETS, the startup logs do not record the current partitions' offsets, which makes it difficult to determine the starting offsets when investigating data problems. We could record them to make this easier.
{code:java}
2022-04-15 22:27:58.802 INFO [95] org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 11 will start reading the following 1 partitions from the committed group offsets in Kafka: [KafkaTopicPartition{topic='kafka_topic', partition=4}]
2022-04-15 22:27:58.802 INFO [94] org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 5 will start reading the following 1 partitions from the committed group offsets in Kafka: [KafkaTopicPartition{topic='kafka_topic', partition=14}]
2022-04-15 22:27:58.805 INFO [92] org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 3 will start reading the following 1 partitions from the committed group offsets in Kafka: [KafkaTopicPartition{topic='kafka_topic', partition=12, wish here to log offsets}]
{code}
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[ANNOUNCE] Call for Presentations for ApacheCon Asia 2022 streaming track
Hi everyone, ApacheCon Asia [1] will feature the Streaming track for the second year. Please don't hesitate to submit your proposal if there is an interesting project or Flink experience you would like to share with us! The conference will be online (virtual) and the talks will be pre-recorded. The deadline of proposal submission is at the end of this month (May 31st). See you all there :) Best Regards, Yu [1] https://apachecon.com/acasia2022/cfp.html
[jira] [Created] (FLINK-27683) Insert into (column1, column2) Values(.....) can't work with sql Hints
Xin Yang created FLINK-27683: Summary: Insert into (column1, column2) Values(.....) can't work with sql Hints Key: FLINK-27683 URL: https://issues.apache.org/jira/browse/FLINK-27683 Project: Flink Issue Type: New Feature Affects Versions: 1.14.0 Reporter: Xin Yang
{code:java}
INSERT INTO `tidb`.`%s`.`%s` /*+ OPTIONS('tidb.sink.update-columns'='c2, c13') */
(c2, c13) values(1, 12.12)
{code}
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27682) [JUnit5 Migration] Migrate ComparatorTestBase to Junit5
Sergey Nuyanzin created FLINK-27682: --- Summary: [JUnit5 Migration] Migrate ComparatorTestBase to Junit5 Key: FLINK-27682 URL: https://issues.apache.org/jira/browse/FLINK-27682 Project: Flink Issue Type: Sub-task Components: Tests Reporter: Sergey Nuyanzin Several modules depend on it -- This message was sent by Atlassian Jira (v8.20.7#820007)
Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client
Hi Paul, 1) SHOW QUERIES +1 to add finished time, but it would be better to call it "end_time" to keep aligned with names in Web UI. 2) DROP QUERY I think we shouldn't throw exceptions for batch jobs, otherwise, how to stop batch queries? At present, I don't think "DROP" is a suitable keyword for this statement. From the perspective of users, "DROP" sounds like the query should be removed from the list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more suitable and compliant with commands of Flink CLI. 3) SHOW SAVEPOINTS I think this statement is needed, otherwise, savepoints are lost after the SAVEPOINT command is executed. Savepoints can be retrieved from REST API "/jobs/:jobid/checkpoints" with filtering "checkpoint_type"="savepoint". It's also worth considering providing "SHOW CHECKPOINTS" to list all checkpoints. 4) SAVEPOINT & RELEASE SAVEPOINT I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements now. In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both the same savepoint id. However, in our syntax, the first one is a query id, and the second one is a savepoint path, which is confusing and not consistent. When I came across SHOW SAVEPOINT, I thought maybe they should be in the same syntax set. For example, CREATE SAVEPOINT FOR [QUERY] & DROP SAVEPOINT . That means we don't follow the majority of vendors in SAVEPOINT commands. I would say the purpose is different in Flink. What are others' opinions on this? Best, Jark [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints On Wed, 18 May 2022 at 14:43, Paul Lam wrote: > Hi Godfrey, > > Thanks a lot for your inputs! > > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream > or SQL) or > clients (SQL client or CLI). Under the hook, it’s based on > ClusterClient#listJobs, the > same with Flink CLI.
I think it’s okay to have non-SQL jobs listed in SQL > client, because > these jobs can be managed via SQL client too. > > WRT finished time, I think you’re right. Adding it to the FLIP. But I’m a > bit afraid that the > rows would be too long. > > WRT ‘DROP QUERY’, > > What's the behavior for batch jobs and the non-running jobs? > > > In general, the behavior would be aligned with Flink CLI. Triggering a > savepoint for > a non-running job would cause errors, and the error message would be > printed to > the SQL client. Triggering a savepoint for batch(unbounded) jobs in > streaming > execution mode would be the same with streaming jobs. However, for batch > jobs in > batch execution mode, I think there would be an error, because batch > execution > doesn’t support checkpoints currently (please correct me if I’m wrong). > > WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/ > jobClient doesn’t have such a functionality at the moment, neither do > Flink CLI. > Maybe we could make it a follow-up FLIP, which includes the modifications > to > clusterClient/jobClient and Flink CLI. WDYT? > > Best, > Paul Lam > > > 2022年5月17日 20:34,godfrey he 写道: > > > > Godfrey > >
Re: [VOTE] Creating an Apache Flink slack workspace
+1 (binding) Thanks, Zhu Timo Walther 于2022年5月18日周三 16:52写道: > > +1 (binding) > > Thanks, > Timo > > > On 17.05.22 20:44, Gyula Fóra wrote: > > +1 (binding) > > > > On Tue, 17 May 2022 at 19:52, Yufei Zhang wrote: > > > >> +1 (nonbinding) > >> > >> On Tue, May 17, 2022 at 5:29 PM Márton Balassi > >> wrote: > >> > >>> +1 (binding) > >>> > >>> On Tue, May 17, 2022 at 11:00 AM Jingsong Li > >>> wrote: > >>> > Thank Xintong for driving this work. > > +1 > > Best, > Jingsong > > On Tue, May 17, 2022 at 4:49 PM Martijn Visser < > >> martijnvis...@apache.org > > wrote: > > > +1 (binding) > > > > On Tue, 17 May 2022 at 10:38, Yu Li wrote: > > > >> +1 (binding) > >> > >> Thanks Xintong for driving this! > >> > >> Best Regards, > >> Yu > >> > >> > >> On Tue, 17 May 2022 at 16:32, Robert Metzger > > wrote: > >> > >>> Thanks for starting the VOTE! > >>> > >>> +1 (binding) > >>> > >>> > >>> > >>> On Tue, May 17, 2022 at 10:29 AM Jark Wu > >> wrote: > >>> > Thank Xintong for driving this work. > > +1 from my side (binding) > > Best, > Jark > > On Tue, 17 May 2022 at 16:24, Xintong Song < > >>> tonysong...@gmail.com> > >>> wrote: > > > Hi everyone, > > > > As previously discussed in [1], I would like to open a vote > >> on > >> creating > an > > Apache Flink slack workspace channel. > > > > The proposed actions include: > > - Creating a dedicated slack workspace with the name Apache > >>> Flink > >> that > >>> is > > controlled and maintained by the Apache Flink PMC > > - Updating the Flink website about rules for using various > >>> communication > > channels > > - Setting up an Archive for the Apache Flink slack > > - Revisiting this initiative by the end of 2022 > > > > The vote will last for at least 72 hours, and will be > >> accepted > by a > > consensus of active PMC members. > > > > Best, > > > > Xintong > > > > >>> > >> > > > > >>> > >> > > >
[jira] [Created] (FLINK-27681) Improve the availability of Flink when the RocksDB file is corrupted.
ming li created FLINK-27681: --- Summary: Improve the availability of Flink when the RocksDB file is corrupted. Key: FLINK-27681 URL: https://issues.apache.org/jira/browse/FLINK-27681 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Reporter: ming li We have encountered several cases where the RocksDB checksum did not match or block verification failed when a job was restored. The cause is generally a problem with the machine where the task is located, which leads to corrupted files being uploaded to HDFS, but a long time (a dozen minutes to half an hour) had already passed by the time we found the problem. I'm not sure if anyone else has had a similar problem. Since such a file is referenced by incremental checkpoints for a long time, once the maximum number of retained checkpoints is exceeded, we can only keep using the file until it is no longer referenced. When the job fails, it cannot be recovered. Therefore we are considering: 1. Can RocksDB periodically check whether all files are correct and detect the problem in time? 2. Can Flink automatically roll back to the previous checkpoint when there is a problem with the checkpoint data? Even with manual intervention, one can only try to recover from an existing checkpoint or discard the entire state. 3. Can we cap the number of references to a file based on the maximum number of retained checkpoints? When the number of references exceeds (maximum number of checkpoints - 1), the Task side would be required to upload a new file for the next reference. We are not sure whether this would guarantee that the newly uploaded file is correct. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27680) Disable PulsarSinkITCase on JDK 11
Martijn Visser created FLINK-27680: -- Summary: Disable PulsarSinkITCase on JDK 11 Key: FLINK-27680 URL: https://issues.apache.org/jira/browse/FLINK-27680 Project: Flink Issue Type: Technical Debt Components: Connectors / Pulsar Affects Versions: 1.16.0, 1.14.5, 1.15.1 Reporter: Martijn Visser Assignee: Martijn Visser Since Pulsar doesn't yet support Java 11, we should make sure that the Pulsar tests don't run when testing JDK11. This is the case already for the e2e tests, but not yet for the connector tests. We should disable this too. -- This message was sent by Atlassian Jira (v8.20.7#820007)
Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric
Hi Jark and Alexander, Thanks for your comments! I’m also OK to introduce common table options. I prefer to introduce a new DefaultLookupCacheOptions class for holding these option definitions because putting all options into FactoryUtil would make it a bit ”crowded” and not well categorized. FLIP has been updated according to suggestions above: 1. Use static “of” method for constructing RescanRuntimeProvider considering both arguments are required. 2. Introduce new table options matching DefaultLookupCacheFactory Best, Qingsheng On Wed, May 18, 2022 at 2:57 PM Jark Wu wrote: > Hi Alex, > > 1) retry logic > I think we can extract some common retry logic into utilities, e.g. > RetryUtils#tryTimes(times, call). > This seems independent of this FLIP and can be reused by DataStream users. > Maybe we can open an issue to discuss this and where to put it. > > 2) cache ConfigOptions > I'm fine with defining cache config options in the framework. > A candidate place to put is FactoryUtil which also includes > "sink.parallelism", "format" options. > > Best, > Jark > > > On Wed, 18 May 2022 at 13:52, Александр Смирнов > wrote: > >> Hi Qingsheng, >> >> Thank you for considering my comments. >> >> > there might be custom logic before making retry, such as re-establish >> the connection >> >> Yes, I understand that. I meant that such logic can be placed in a >> separate function, that can be implemented by connectors. Just moving >> the retry logic would make connector's LookupFunction more concise + >> avoid duplicate code. However, it's a minor change. The decision is up >> to you. >> >> > We decide not to provide common DDL options and let developers to >> define their own options as we do now per connector. >> >> What is the reason for that? One of the main goals of this FLIP was to >> unify the configs, wasn't it? I understand that current cache design >> doesn't depend on ConfigOptions, like was before. 
But still we can put >> these options into the framework, so connectors can reuse them and >> avoid code duplication, and, what is more significant, avoid possible >> different options naming. This moment can be pointed out in >> documentation for connector developers. >> >> Best regards, >> Alexander >> >> вт, 17 мая 2022 г. в 17:11, Qingsheng Ren : >> > >> > Hi Alexander, >> > >> > Thanks for the review and glad to see we are on the same page! I think >> you forgot to cc the dev mailing list so I’m also quoting your reply under >> this email. >> > >> > > We can add 'maxRetryTimes' option into this class >> > >> > In my opinion the retry logic should be implemented in lookup() instead >> of in LookupFunction#eval(). Retrying is only meaningful under some >> specific retriable failures, and there might be custom logic before making >> retry, such as re-establish the connection (JdbcRowDataLookupFunction is an >> example), so it's more handy to leave it to the connector. >> > >> > > I don't see DDL options, that were in previous version of FLIP. Do >> you have any special plans for them? >> > >> > We decide not to provide common DDL options and let developers to >> define their own options as we do now per connector. >> > >> > The rest of comments sound great and I’ll update the FLIP. Hope we can >> finalize our proposal soon! >> > >> > Best, >> > >> > Qingsheng >> > >> > >> > > On May 17, 2022, at 13:46, Александр Смирнов >> wrote: >> > > >> > > Hi Qingsheng and devs! >> > > >> > > I like the overall design of updated FLIP, however I have several >> > > suggestions and questions. >> > > >> > > 1) Introducing LookupFunction as a subclass of TableFunction is a good >> > > idea. We can add 'maxRetryTimes' option into this class. 'eval' method >> > > of new LookupFunction is great for this purpose. The same is for >> > > 'async' case. 
>> > > >> > > 2) There might be other configs in future, such as 'cacheMissingKey' >> > > in LookupFunctionProvider or 'rescanInterval' in ScanRuntimeProvider. >> > > Maybe use Builder pattern in LookupFunctionProvider and >> > > RescanRuntimeProvider for more flexibility (use one 'build' method >> > > instead of many 'of' methods in future)? >> > > >> > > 3) What are the plans for existing TableFunctionProvider and >> > > AsyncTableFunctionProvider? I think they should be deprecated. >> > > >> > > 4) Am I right that the current design does not assume usage of >> > > user-provided LookupCache in re-scanning? In this case, it is not very >> > > clear why do we need methods such as 'invalidate' or 'putAll' in >> > > LookupCache. >> > > >> > > 5) I don't see DDL options, that were in previous version of FLIP. Do >> > > you have any special plans for them? >> > > >> > > If you don't mind, I would be glad to be able to make small >> > > adjustments to the FLIP document too. I think it's worth mentioning >> > > about what exactly optimizations are planning in the future. >> > > >> > > Best regards, >> > > Smirnov Alexander >> > > >> > > пт, 13 мая 2022 г. в 20:27, Qingsheng Ren : >>
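The `RetryUtils#tryTimes(times, call)` helper suggested in the quoted discussion does not exist yet; a minimal sketch of what it could look like (class and method names are assumptions taken from the thread):

```java
// Sketch of the proposed (not yet existing) retry helper; the class and
// method names are assumptions taken from the quoted discussion.
import java.util.concurrent.Callable;

public class RetryUtils {

    /**
     * Invokes {@code call} up to {@code times} attempts and rethrows the last
     * failure if all attempts fail. Connector-specific recovery logic (e.g.
     * re-establishing a connection) can still live inside the callable.
     */
    public static <T> T tryTimes(int times, Callable<T> call) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= times; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Demo: the callable fails twice, then succeeds on the third attempt.
        int[] attempts = {0};
        String result = tryTimes(3, () -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new RuntimeException("transient failure");
            }
            return "ok";
        });
        System.out.println(result + " after " + attempts[0] + " attempts"); // ok after 3 attempts
    }
}
```

A lookup function could then wrap its backend call as `RetryUtils.tryTimes(maxRetryTimes, () -> queryBackend(key))` instead of open-coding the retry loop.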
Could not copy native libraries - Permission denied
Hi, We are using Flink version 1.13 with a Kafka source and a Kinesis sink, with a parallelism of 3. On submitting the job I get the error "Could not copy native binaries to temp directory /tmp/amazon-kinesis-producer-native-binaries", followed by "permission denied", even though all the required permissions have been granted and the job is run as the root user. What could be causing this?
Re: [VOTE] Creating an Apache Flink slack workspace
+1 (binding) Thanks, Timo On 17.05.22 20:44, Gyula Fóra wrote: +1 (binding) On Tue, 17 May 2022 at 19:52, Yufei Zhang wrote: +1 (nonbinding) On Tue, May 17, 2022 at 5:29 PM Márton Balassi wrote: +1 (binding) On Tue, May 17, 2022 at 11:00 AM Jingsong Li wrote: Thank Xintong for driving this work. +1 Best, Jingsong On Tue, May 17, 2022 at 4:49 PM Martijn Visser < martijnvis...@apache.org wrote: +1 (binding) On Tue, 17 May 2022 at 10:38, Yu Li wrote: +1 (binding) Thanks Xintong for driving this! Best Regards, Yu On Tue, 17 May 2022 at 16:32, Robert Metzger wrote: Thanks for starting the VOTE! +1 (binding) On Tue, May 17, 2022 at 10:29 AM Jark Wu wrote: Thank Xintong for driving this work. +1 from my side (binding) Best, Jark On Tue, 17 May 2022 at 16:24, Xintong Song < tonysong...@gmail.com> wrote: Hi everyone, As previously discussed in [1], I would like to open a vote on creating an Apache Flink slack workspace channel. The proposed actions include: - Creating a dedicated slack workspace with the name Apache Flink that is controlled and maintained by the Apache Flink PMC - Updating the Flink website about rules for using various communication channels - Setting up an Archive for the Apache Flink slack - Revisiting this initiative by the end of 2022 The vote will last for at least 72 hours, and will be accepted by a consensus of active PMC members. Best, Xintong
[jira] [Created] (FLINK-27679) Support append-only table for log store.
Zheng Hu created FLINK-27679: Summary: Support append-only table for log store. Key: FLINK-27679 URL: https://issues.apache.org/jira/browse/FLINK-27679 Project: Flink Issue Type: Sub-task Reporter: Zheng Hu Will publish separate PR to support append-only table for log table. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27678) Support append-only table for file store.
Zheng Hu created FLINK-27678: Summary: Support append-only table for file store. Key: FLINK-27678 URL: https://issues.apache.org/jira/browse/FLINK-27678 Project: Flink Issue Type: Sub-task Reporter: Zheng Hu Let me publish a separate PR for supporting append-only table in flink table store's file store. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27677) Kubernetes reuse rest.bind-port, but do not support a range of ports
tartarus created FLINK-27677: Summary: Kubernetes reuse rest.bind-port, but do not support a range of ports Key: FLINK-27677 URL: https://issues.apache.org/jira/browse/FLINK-27677 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes Affects Versions: 1.15.0 Reporter: tartarus The k8s module reuses the rest option {color:#DE350B}rest.bind-port{color}, but does not support a range of ports:
{code:java}
/**
 * Parse a valid port for the config option. A fixed port is expected, and do not support a
 * range of ports.
 *
 * @param flinkConfig flink config
 * @param port port config option
 * @return valid port
 */
public static Integer parsePort(Configuration flinkConfig, ConfigOption<String> port) {
    checkNotNull(flinkConfig.get(port), port.key() + " should not be null.");

    try {
        return Integer.parseInt(flinkConfig.get(port));
    } catch (NumberFormatException ex) {
        throw new FlinkRuntimeException(
                port.key()
                        + " should be specified to a fixed port. Do not support a range of ports.",
                ex);
    }
}
{code}
-- This message was sent by Atlassian Jira (v8.20.7#820007)
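For illustration, supporting a range along the lines the report suggests could look roughly like the sketch below. This is a simplified stand-in, not Flink code (a real fix would likely reuse Flink's existing range parsing, e.g. NetUtils#getPortRangeFromString, which also handles comma-separated lists):

```java
// Illustrative sketch only: parses either a fixed port ("8081") or an
// inclusive range ("8081-8084"). Not Flink's actual implementation.
import java.util.ArrayList;
import java.util.List;

public class PortRangeSketch {

    static List<Integer> parsePorts(String value) {
        List<Integer> ports = new ArrayList<>();
        int dash = value.indexOf('-');
        if (dash < 0) {
            // fixed port: the only form supported today
            ports.add(Integer.parseInt(value.trim()));
        } else {
            // inclusive range "start-end"
            int start = Integer.parseInt(value.substring(0, dash).trim());
            int end = Integer.parseInt(value.substring(dash + 1).trim());
            for (int p = start; p <= end; p++) {
                ports.add(p);
            }
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.println(parsePorts("8081"));      // [8081]
        System.out.println(parsePorts("8081-8084")); // [8081, 8082, 8083, 8084]
    }
}
```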
[jira] [Created] (FLINK-27676) Output records from on_timer are behind the triggering watermark in PyFlink
Juntao Hu created FLINK-27676: - Summary: Output records from on_timer are behind the triggering watermark in PyFlink Key: FLINK-27676 URL: https://issues.apache.org/jira/browse/FLINK-27676 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.15.0 Reporter: Juntao Hu Fix For: 1.16.0 Currently, when dealing with watermarks in AbstractPythonFunctionOperator, super.processWatermark(mark) is called, which advances the watermark in the timeServiceManager (thus triggering timers) and then emits the current watermark. However, timer triggering is not synchronous in PyFlink (processTimer only puts data into the Beam buffer), so when the remote bundle is closed and the output records produced by the on_timer function finally arrive at the Java side, they are already behind the triggering watermark. -- This message was sent by Atlassian Jira (v8.20.7#820007)
Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric
Hi Alex, 1) retry logic I think we can extract some common retry logic into utilities, e.g. RetryUtils#tryTimes(times, call). This seems independent of this FLIP and can be reused by DataStream users. Maybe we can open an issue to discuss this and where to put it. 2) cache ConfigOptions I'm fine with defining cache config options in the framework. A candidate place to put is FactoryUtil which also includes "sink.parallelism", "format" options. Best, Jark On Wed, 18 May 2022 at 13:52, Александр Смирнов wrote: > Hi Qingsheng, > > Thank you for considering my comments. > > > there might be custom logic before making retry, such as re-establish > the connection > > Yes, I understand that. I meant that such logic can be placed in a > separate function, that can be implemented by connectors. Just moving > the retry logic would make connector's LookupFunction more concise + > avoid duplicate code. However, it's a minor change. The decision is up > to you. > > > We decide not to provide common DDL options and let developers to define > their own options as we do now per connector. > > What is the reason for that? One of the main goals of this FLIP was to > unify the configs, wasn't it? I understand that current cache design > doesn't depend on ConfigOptions, like was before. But still we can put > these options into the framework, so connectors can reuse them and > avoid code duplication, and, what is more significant, avoid possible > different options naming. This moment can be pointed out in > documentation for connector developers. > > Best regards, > Alexander > > вт, 17 мая 2022 г. в 17:11, Qingsheng Ren : > > > > Hi Alexander, > > > > Thanks for the review and glad to see we are on the same page! I think > you forgot to cc the dev mailing list so I’m also quoting your reply under > this email. > > > > > We can add 'maxRetryTimes' option into this class > > > > In my opinion the retry logic should be implemented in lookup() instead > of in LookupFunction#eval(). 
Retrying is only meaningful under some > specific retriable failures, and there might be custom logic before making > retry, such as re-establish the connection (JdbcRowDataLookupFunction is an > example), so it's more handy to leave it to the connector. > > > > > I don't see DDL options, that were in previous version of FLIP. Do you > have any special plans for them? > > > > We decide not to provide common DDL options and let developers to define > their own options as we do now per connector. > > > > The rest of comments sound great and I’ll update the FLIP. Hope we can > finalize our proposal soon! > > > > Best, > > > > Qingsheng > > > > > > > On May 17, 2022, at 13:46, Александр Смирнов > wrote: > > > > > > Hi Qingsheng and devs! > > > > > > I like the overall design of updated FLIP, however I have several > > > suggestions and questions. > > > > > > 1) Introducing LookupFunction as a subclass of TableFunction is a good > > > idea. We can add 'maxRetryTimes' option into this class. 'eval' method > > > of new LookupFunction is great for this purpose. The same is for > > > 'async' case. > > > > > > 2) There might be other configs in future, such as 'cacheMissingKey' > > > in LookupFunctionProvider or 'rescanInterval' in ScanRuntimeProvider. > > > Maybe use Builder pattern in LookupFunctionProvider and > > > RescanRuntimeProvider for more flexibility (use one 'build' method > > > instead of many 'of' methods in future)? > > > > > > 3) What are the plans for existing TableFunctionProvider and > > > AsyncTableFunctionProvider? I think they should be deprecated. > > > > > > 4) Am I right that the current design does not assume usage of > > > user-provided LookupCache in re-scanning? In this case, it is not very > > > clear why do we need methods such as 'invalidate' or 'putAll' in > > > LookupCache. > > > > > > 5) I don't see DDL options, that were in previous version of FLIP. Do > > > you have any special plans for them? 
> > >
> > > If you don't mind, I would be glad to be able to make small
> > > adjustments to the FLIP document too. I think it's worth mentioning
> > > which optimizations exactly are planned for the future.
> > >
> > > Best regards,
> > > Smirnov Alexander
> > >
> > > On Fri, 13 May 2022 at 20:27, Qingsheng Ren wrote:
> > >>
> > >> Hi Alexander and devs,
> > >>
> > >> Thank you very much for the in-depth discussion! As Jark mentioned, we
> > >> were inspired by Alexander's idea and refactored our design.
> > >> FLIP-221 [1] has been updated to reflect our design now, and we are happy to
> > >> hear more suggestions from you!
> > >>
> > >> Compared to the previous design:
> > >> 1. The lookup cache serves at the table runtime level and is integrated
> > >> as a component of LookupJoinRunner, as discussed previously.
> > >> 2. Interfaces are renamed and re-designed to reflect the new design.
> > >> 3. We separate the all-caching case individually and introduce a new
> > >> RescanRuntimeProvider to reuse the ability of scanning. We are planning to
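The RetryUtils#tryTimes(times, call) helper Jark floats at the top of this thread could look roughly like the sketch below. The class, signature, and behavior are assumptions for illustration (no such utility existed in Flink at the time of this thread), not a committed API:

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch of the generic retry helper discussed in this thread. */
public final class RetryUtils {

    /** Invokes the call, retrying up to {@code times} attempts; rethrows the last failure. */
    public static <T> T tryTimes(int times, Callable<T> call) throws Exception {
        if (times <= 0) {
            throw new IllegalArgumentException("times must be positive, got " + times);
        }
        Exception last = null;
        for (int attempt = 1; attempt <= times; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                // A connector could hook custom logic here, e.g. re-establishing a connection,
                // before the next attempt -- the point raised by Qingsheng above.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Fails twice, then succeeds on the third attempt.
        int[] attempts = {0};
        String result = tryTimes(3, () -> {
            if (++attempts[0] < 3) {
                throw new RuntimeException("transient failure");
            }
            return "ok";
        });
        System.out.println(result + " after " + attempts[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```

Keeping the retriable call a plain Callable is what would make such a utility reusable by DataStream users as well, independent of the lookup-join machinery.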
Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client
Hi Godfrey,

Thanks a lot for your inputs!

'SHOW QUERIES' lists all jobs in the cluster, with no limit on APIs (DataStream or SQL) or clients (SQL client or CLI). Under the hood, it's based on ClusterClient#listJobs, the same as the Flink CLI. I think it's okay to have non-SQL jobs listed in the SQL client, because these jobs can be managed via the SQL client too.

WRT finished time, I think you're right. Adding it to the FLIP. But I'm a bit afraid that the rows would be too long.

WRT 'DROP QUERY',

> What's the behavior for batch jobs and the non-running jobs?

In general, the behavior would be aligned with the Flink CLI. Triggering a savepoint for a non-running job would cause errors, and the error message would be printed to the SQL client. Triggering a savepoint for batch (unbounded) jobs in streaming execution mode would be the same as for streaming jobs. However, for batch jobs in batch execution mode, I think there would be an error, because batch execution doesn't support checkpoints currently (please correct me if I'm wrong).

WRT 'SHOW SAVEPOINTS', I've thought about it, but Flink's ClusterClient/JobClient doesn't have such functionality at the moment, and neither does the Flink CLI. Maybe we could make it a follow-up FLIP that includes the modifications to ClusterClient/JobClient and the Flink CLI. WDYT?

Best,
Paul Lam

> On May 17, 2022, at 20:34, godfrey he wrote:
>
> Godfrey
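For orientation, the lifecycle statements under discussion in this thread could be sketched as below. This is an illustrative fragment only; FLIP-222's syntax was still being debated at this point, so the exact keywords and semantics are not final:

```sql
-- List all jobs in the cluster (SQL and non-SQL alike, via ClusterClient#listJobs)
SHOW QUERIES;

-- Stop a job; per the discussion above, triggering a savepoint fails for
-- non-running jobs and for jobs in batch execution mode
DROP QUERY '<job id>';

-- A possible follow-up once ClusterClient/JobClient gain the capability
SHOW SAVEPOINTS;
```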
Re: [VOTE] FLIP-229: Introduces Join Hint for Flink SQL Batch Job
+1 Thanks for driving.

Best,
Jingsong

On Wed, May 18, 2022 at 1:33 PM godfrey he wrote:
> Thanks Xuyang for driving this, +1 (binding)
>
> Best,
> Godfrey
>
> On Tue, May 17, 2022 at 10:21, Xuyang wrote:
> >
> > Hi, everyone.
> > Thanks for your feedback on FLIP-229: Introduces Join Hint for Flink
> > SQL Batch Job [1] in the discussion thread [2].
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours unless there is an objection or not enough votes.
> >
> > --
> >
> > Best!
> > Xuyang
> >
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job
> > [2] https://lists.apache.org/thread/y668bxyjz66ggtjypfz9t571m0tyvv9h