Re: [DISCUSS] Merge "flink run" and "flink run-application" in Flink 2.0

2024-06-07 Thread Ferenc Csaky
Hi,

Thank you everyone for the valuable comments. If there are no new messages
until then, I will start a vote on Monday.

Thanks,
Ferenc




On Monday, 3 June 2024 at 17:27, Jeyhun Karimov  wrote:

> 
> 
> Hi Ferenc,
> 
> Thanks for the proposal. +1 for it! This FLIP will improve the user
> experience.
> 
> Regards,
> Jeyhun
> 
> 
> 
> 
> 
> On Mon, Jun 3, 2024 at 1:50 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Hang,
> > 
> > Thank you for your input, both points make sense; I updated the
> > FLIP accordingly.
> > 
> > Best,
> > Ferenc
> > 
> > On Friday, 31 May 2024 at 04:31, Hang Ruan ruanhang1...@gmail.com wrote:
> > 
> > > Hi, Ferenc.
> > > 
> > > +1 for this proposal. This FLIP will help to make the CLI clearer for
> > > users.
> > > 
> > > I think it would be better to add an example to the FLIP about how to use
> > > application mode with the new CLI.
> > > Besides that, we need to add some new tests for this change instead of
> > > only using the existing tests.
> > > 
> > > Best,
> > > Hang
> > > 
> > > Mate Czagany czmat...@gmail.com wrote on Wed, 29 May 2024 at 19:57:
> > > 
> > > > Hi Ferenc,
> > > > 
> > > > Thanks for the FLIP, +1 from me for the proposal. I think these changes
> > > > would be a great solution to all the confusion that comes from these
> > > > two
> > > > action parameters.
> > > > 
> > > > Best regards,
> > > > Mate
> > > > 
> > > > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote (on Tue, 28 May 2024 at 16:13):
> > > > 
> > > > > Thank you Xintong for your input.
> > > > > 
> > > > > I prepared a FLIP for this change [1], looking forward to any
> > > > > other opinions.
> > > > > 
> > > > > Thanks,
> > > > > Ferenc
> > > > > 
> > > > > [1]
> > 
> > https://docs.google.com/document/d/1EX74rFp9bMKdfoGkz1ASOM6Ibw32rRxIadX72zs2zoY/edit?usp=sharing
> > 
> > > > > On Friday, 17 May 2024 at 07:04, Xintong Song tonysong...@gmail.com
> > > > > wrote:
> > > > > 
> > > > > > AFAIK, the main purpose of having `run-application` was to make sure
> > > > > > the user is aware that application mode is used, which executes the
> > > > > > main method of the user program in the JM rather than in the client.
> > > > > > This was important at the time application mode was first introduced,
> > > > > > but maybe not that important anymore, given that per-job mode is
> > > > > > deprecated and likely to be removed in 2.0. Therefore, +1 for the
> > > > > > proposal.
> > > > > > 
> > > > > > Best,
> > > > > > 
> > > > > > Xintong
> > > > > > 
> > > > > > On Thu, May 16, 2024 at 11:35 PM Ferenc Csaky
> > > > > > ferenc.cs...@pm.me.invalid
> > > > > > 
> > > > > > wrote:
> > > > > > 
> > > > > > > Hello devs,
> > > > > > > 
> > > > > > > I have seen quite a few examples of customers being confused about
> > > > > > > run and run-application in the Flink CLI, and I was wondering about
> > > > > > > the necessity of deploying Application Mode (AM) jobs with a
> > > > > > > different command than Session and Per-Job mode jobs.
> > > > > > > 
> > > > > > > I can see the point that YarnDeploymentTarget [1] and
> > > > > > > KubernetesDeploymentTarget [2] are part of their own Maven modules
> > > > > > > and not known in flink-clients, so the deployment mode validations
> > > > > > > happen during cluster deployment in their specific
> > > > > > > ClusterDescriptor implementations [3].

Re: [DISCUSS] Merge "flink run" and "flink run-application" in Flink 2.0

2024-06-03 Thread Ferenc Csaky
Hi Hang,

Thank you for your input, both points make sense; I updated the
FLIP accordingly.

Best,
Ferenc




On Friday, 31 May 2024 at 04:31, Hang Ruan  wrote:

> 
> 
> Hi, Ferenc.
> 
> +1 for this proposal. This FLIP will help to make the CLI clearer for users.
> 
> I think it would be better to add an example to the FLIP about how to use
> application mode with the new CLI.
> Besides that, we need to add some new tests for this change instead of only
> using the existing tests.
> 
> Best,
> Hang
> 
> Mate Czagany czmat...@gmail.com wrote on Wed, 29 May 2024 at 19:57:
> 
> > Hi Ferenc,
> > 
> > Thanks for the FLIP, +1 from me for the proposal. I think these changes
> > would be a great solution to all the confusion that comes from these two
> > action parameters.
> > 
> > Best regards,
> > Mate
> > 
> > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote (on Tue, 28 May 2024 at 16:13):
> > 
> > > Thank you Xintong for your input.
> > > 
> > > I prepared a FLIP for this change [1], looking forward to any
> > > other opinions.
> > > 
> > > Thanks,
> > > Ferenc
> > > 
> > > [1]
> > 
> > https://docs.google.com/document/d/1EX74rFp9bMKdfoGkz1ASOM6Ibw32rRxIadX72zs2zoY/edit?usp=sharing
> > 
> > > On Friday, 17 May 2024 at 07:04, Xintong Song tonysong...@gmail.com
> > > wrote:
> > > 
> > > > AFAIK, the main purpose of having `run-application` was to make sure
> > > > the user is aware that application mode is used, which executes the
> > > > main method of the user program in the JM rather than in the client.
> > > > This was important at the time application mode was first introduced,
> > > > but maybe not that important anymore, given that per-job mode is
> > > > deprecated and likely to be removed in 2.0. Therefore, +1 for the
> > > > proposal.
> > > > 
> > > > Best,
> > > > 
> > > > Xintong
> > > > 
> > > > On Thu, May 16, 2024 at 11:35 PM Ferenc Csaky
> > > > ferenc.cs...@pm.me.invalid
> > > > 
> > > > wrote:
> > > > 
> > > > > Hello devs,
> > > > > 
> > > > > I have seen quite a few examples of customers being confused about run
> > > > > and run-application in the Flink CLI, and I was wondering about the
> > > > > necessity of deploying Application Mode (AM) jobs with a different
> > > > > command than Session and Per-Job mode jobs.
> > > > > 
> > > > > I can see the point that YarnDeploymentTarget [1] and
> > > > > KubernetesDeploymentTarget [2] are part of their own Maven modules and
> > > > > not known in flink-clients, so the deployment mode validations happen
> > > > > during cluster deployment in their specific ClusterDescriptor
> > > > > implementations [3]. However, these are implementation details that
> > > > > IMO should not define user-facing APIs.
> > > > > 
> > > > > The command line setup is the same for both run and run-application,
> > > > > so I think there is a quite simple way to achieve a unified flink run
> > > > > experience, but I might have missed something, so I would appreciate
> > > > > any input on this topic.
> > > > > 
> > > > > Based on my assumptions, I think it would be possible to deprecate
> > > > > run-application in Flink 1.20 and remove it completely in Flink 2.0.
> > > > > I already put together a PoC [4], and I was able to deploy AM jobs
> > > > > like this:
> > > > > 
> > > > > flink run --target kubernetes-application ...
> > > > > 
> > > > > If others also agree with this, I would be happy to open a FLIP.
> > > > > WDYT?
> > > > > 
> > > > > Thanks,
> > > > > Ferenc
> > > > > 
> > > > > [1]
> > 
> > https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnDeploymentTarget.java
> > 
> > > > > [2]
> > 
> > https://github.com/apache/flink/blob/master/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/configuration/KubernetesDeploymentTarget.java
> > 
> > > > > [3]
> > 
> > https://github.com/apache/flink/blob/48e5a39c9558083afa7589d2d8b054b625f61ee9/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesClusterDescriptor.java#L206
> > 
> > > > > [4]
> > 
> > https://github.com/ferenc-csaky/flink/commit/40b3e1b998c7a4273eaaff71d9162c9f1ee039c0


Re: [DISCUSS] Merge "flink run" and "flink run-application" in Flink 2.0

2024-05-28 Thread Ferenc Csaky
Thank you Xintong for your input.

I prepared a FLIP for this change [1], looking forward to any
other opinions.

Thanks,
Ferenc

[1] 
https://docs.google.com/document/d/1EX74rFp9bMKdfoGkz1ASOM6Ibw32rRxIadX72zs2zoY/edit?usp=sharing



On Friday, 17 May 2024 at 07:04, Xintong Song  wrote:

> 
> 
> AFAIK, the main purpose of having `run-application` was to make sure
> the user is aware that application mode is used, which executes the main
> method of the user program in the JM rather than in the client. This was
> important at the time application mode was first introduced, but maybe not
> that important anymore, given that per-job mode is deprecated and likely to
> be removed in 2.0. Therefore, +1 for the proposal.
> 
> Best,
> 
> Xintong
> 
> 
> 
> On Thu, May 16, 2024 at 11:35 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hello devs,
> > 
> > I have seen quite a few examples of customers being confused about run and
> > run-application in the Flink CLI, and I was wondering about the necessity
> > of deploying Application Mode (AM) jobs with a different command than
> > Session and Per-Job mode jobs.
> > 
> > I can see the point that YarnDeploymentTarget [1] and
> > KubernetesDeploymentTarget [2] are part of their own Maven modules and not
> > known in flink-clients, so the deployment mode validations happen during
> > cluster deployment in their specific ClusterDescriptor implementations [3].
> > However, these are implementation details that IMO should not define
> > user-facing APIs.
> > 
> > The command line setup is the same for both run and run-application, so I
> > think there is a quite simple way to achieve a unified flink run
> > experience, but I might have missed something, so I would appreciate any
> > input on this topic.
> > 
> > Based on my assumptions, I think it would be possible to deprecate
> > run-application in Flink 1.20 and remove it completely in Flink 2.0. I
> > already put together a PoC [4], and I was able to deploy AM jobs like
> > this:
> > 
> > flink run --target kubernetes-application ...
> > 
> > If others also agree with this, I would be happy to open a FLIP. WDYT?
> > 
> > Thanks,
> > Ferenc
> > 
> > [1]
> > https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnDeploymentTarget.java
> > [2]
> > https://github.com/apache/flink/blob/master/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/configuration/KubernetesDeploymentTarget.java
> > [3]
> > https://github.com/apache/flink/blob/48e5a39c9558083afa7589d2d8b054b625f61ee9/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesClusterDescriptor.java#L206
> > [4]
> > https://github.com/ferenc-csaky/flink/commit/40b3e1b998c7a4273eaaff71d9162c9f1ee039c0


[DISCUSS] Merge "flink run" and "flink run-application" in Flink 2.0

2024-05-16 Thread Ferenc Csaky
Hello devs,

I have seen quite a few examples of customers being confused about run and
run-application in the Flink CLI, and I was wondering about the necessity of
deploying Application Mode (AM) jobs with a different command than Session and
Per-Job mode jobs.

I can see the point that YarnDeploymentTarget [1] and KubernetesDeploymentTarget
[2] are part of their own Maven modules and not known in flink-clients, so the
deployment mode validations happen during cluster deployment in their specific
ClusterDescriptor implementations [3]. However, these are implementation details
that IMO should not define user-facing APIs.

The command line setup is the same for both run and run-application, so I think
there is a quite simple way to achieve a unified flink run experience, but I
might have missed something, so I would appreciate any input on this topic.

Based on my assumptions, I think it would be possible to deprecate
run-application in Flink 1.20 and remove it completely in Flink 2.0. I already
put together a PoC [4], and I was able to deploy AM jobs like this:

flink run --target kubernetes-application ...
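
For comparison, the same deployment with the current CLI requires the separate
action (all other flags elided):

flink run-application --target kubernetes-application ...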

If others also agree with this, I would be happy to open a FLIP. WDYT?

Thanks,
Ferenc

[1] 
https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnDeploymentTarget.java
[2] 
https://github.com/apache/flink/blob/master/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/configuration/KubernetesDeploymentTarget.java
[3] 
https://github.com/apache/flink/blob/48e5a39c9558083afa7589d2d8b054b625f61ee9/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesClusterDescriptor.java#L206
[4] 
https://github.com/ferenc-csaky/flink/commit/40b3e1b998c7a4273eaaff71d9162c9f1ee039c0

Re: [VOTE] FLIP-453: Promote Unified Sink API V2 to Public and Deprecate SinkFunction

2024-05-14 Thread Ferenc Csaky
+1 (non-binding)

Thanks,
Ferenc




On Tuesday, 14 May 2024 at 08:51, weijie guo  wrote:

> 
> 
> Thanks Martijn for the effort!
> 
> +1(binding)
> 
> Best regards,
> 
> Weijie
> 
> 
> Martijn Visser martijnvis...@apache.org wrote on Tue, 14 May 2024 at 14:45:
> 
> > Hi everyone,
> > 
> > With no more discussions open in the thread [1], I would like to start
> > a vote on FLIP-453: Promote Unified Sink API V2 to Public and Deprecate
> > SinkFunction [2]
> > 
> > The vote will be open for at least 72 hours unless there is an objection or
> > insufficient votes.
> > 
> > Best regards,
> > 
> > Martijn
> > 
> > [1] https://lists.apache.org/thread/hod6bg421bzwhbfv60lwsck7r81dvo59
> > [2]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-453%3A+Promote+Unified+Sink+API+V2+to+Public+and+Deprecate+SinkFunction


Re: [DISCUSS] FLIP-453: Promote Unified Sink API V2 to Public and Deprecate SinkFunction

2024-05-03 Thread Ferenc Csaky
Hi Martijn,

+1 for the proposal.

> targeted for Flink 1.19

I guess you meant Flink 1.20 here.

Also, I volunteer to take on updating the HBase sink, feel free to assign that
task to me.

Best,
Ferenc




On Friday, May 3rd, 2024 at 10:20, Martijn Visser  
wrote:

> 
> 
> Hi Peter,
> 
> I'll add it for completeness, thanks!
> With regards to FLINK-35149, the fix version indicates a change at Flink
> CDC; is that indeed correct, or does it require a change in the SinkV2
> interface?
> 
> Best regards,
> 
> Martijn
> 
> 
> On Fri, May 3, 2024 at 7:47 AM Péter Váry peter.vary.apa...@gmail.com
> 
> wrote:
> 
> > Hi Martijn,
> > 
> > We might want to add FLIP-371 [1] to the list. (Or do we aim only for
> > higher-level FLIPs?)
> > 
> > We are in the process of using the new API in the Iceberg connector [2] -
> > so far, so good.
> > 
> > I know of one minor known issue about the sink [3], whose fix should be
> > ready for the release.
> > 
> > All-in-all, I think we are in good shape, and we could move forward with
> > the promotion.
> > 
> > Thanks,
> > Peter
> > 
> > [1] -
> > 
> > https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=263430387
> > [2] - https://github.com/apache/iceberg/pull/10179
> > [3] - https://issues.apache.org/jira/browse/FLINK-35149
> > 
> > On Thu, May 2, 2024, 09:47 Muhammet Orazov mor+fl...@morazow.com.invalid
> > wrote:
> > 
> > > Got it, thanks!
> > > 
> > > On 2024-05-02 06:53, Martijn Visser wrote:
> > > 
> > > > Hi Muhammet,
> > > > 
> > > > Thanks for joining the discussion! The changes in this FLIP would be
> > > > targeted for Flink 1.19, since it's only a matter of changing the
> > > > annotation.
> > > > 
> > > > Best regards,
> > > > 
> > > > Martijn
> > > > 
> > > > On Thu, May 2, 2024 at 7:26 AM Muhammet Orazov mor+fl...@morazow.com
> > > > wrote:
> > > > 
> > > > > Hello Martijn,
> > > > > 
> > > > > Thanks for the FLIP and detailed history of changes, +1.
> > > > > 
> > > > > Would the FLIP changes target 2.0? I think it would be good
> > > > > to have clear APIs for the 2.0 release.
> > > > > 
> > > > > Best,
> > > > > Muhammet
> > > > > 
> > > > > On 2024-05-01 15:30, Martijn Visser wrote:
> > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > I would like to start a discussion on FLIP-453: Promote Unified Sink
> > > > > > API V2
> > > > > > to Public and Deprecate SinkFunction
> > > > > > https://cwiki.apache.org/confluence/x/rIobEg
> > > > > > 
> > > > > > This FLIP proposes to promote the Unified Sink API V2 from
> > > > > > PublicEvolving
> > > > > > to Public and to mark the SinkFunction as Deprecated.
> > > > > > 
> > > > > > I'm looking forward to your thoughts.
> > > > > > 
> > > > > > Best regards,
> > > > > > 
> > > > > > Martijn


Re: [VOTE] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-24 Thread Ferenc Csaky
+1 (non-binding), looking forward to this!

Best,
Ferenc




On Wednesday, April 24th, 2024 at 10:03, Mate Czagany  
wrote:

> 
> 
> Hi everyone,
> 
> I'd like to start a vote on the FLIP-446: Kubernetes Operator State
> Snapshot CRD [1]. The discussion thread is here [2].
> 
> The vote will be open for at least 72 hours unless there is an objection or
> insufficient votes.
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-446%3A+Kubernetes+Operator+State+Snapshot+CRD
> [2] https://lists.apache.org/thread/q5dzjwj0qk34rbg2sczyypfhokxoc3q7
> 
> Regards,
> Mate


Re: [DISCUSS] FLIP-438: Make Flink's Hadoop and YARN configuration probing consistent

2024-04-18 Thread Ferenc Csaky
Hi Venkata krishnan,

My general point was that personally I do not think that the
current implementation is wrong or confusing. The main thing
here is that how we define "consistent" in this case is subjective.
From the proposed point of view, consistent means we use the same
prefix. But we can also consistently use the least required prefix
groups to identify a subsystem property, which is how it works
right now. The property naming conventions of these dependent
systems are different, so their prefixes in Flink differ as well.

It is very possible I am in the minority with my view,
but I do not think duplicating the `yarn` prefix to make it
conform with the Hadoop props would be a better UX than what we
have now, just a different one.

One thing that is maybe less opinionated is whether the proposed
solution simplifies the property loading logic. Currently, Hadoop
props from the Flink conf are parsed in `HadoopUtils` (flink-hadoop-fs
module), while YARN props are parsed in `Utils` (flink-yarn module).
Maybe `org.apache.flink.configuration.Configuration` could have a
helper to extract all prefixed properties to a `Map`, or another
`Configuration` object (the latter could be easily added as a
resource to the dependent system config objects).

That simplification could make the overall prefix-based loading
logic cleaner IMO, and that is something that would be useful.
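
To make that concrete, a minimal sketch of such a helper; the method and class
names are made up for illustration and this is not an existing Configuration
API (only `Configuration#toMap` is):

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.configuration.Configuration;

public final class PrefixedConfigUtil {

    /**
     * Hypothetical helper: collects every "prefix.*" entry of the given
     * Flink configuration and strips the prefix to obtain the dependent
     * system's property key, e.g. "flink.hadoop.dfs.replication" ->
     * "dfs.replication" for the prefix "flink.hadoop.".
     */
    public static Map<String, String> extractPrefixed(Configuration conf, String prefix) {
        Map<String, String> extracted = new HashMap<>();
        for (Map.Entry<String, String> entry : conf.toMap().entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                extracted.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return extracted;
    }
}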

WDYT?

Best,
Ferenc





On Monday, April 15th, 2024 at 20:55, Venkatakrishnan Sowrirajan 
 wrote:

> 
> 
> Sorry for the late reply, Ferenc.
> 
> I understand the rationale behind the current implementation, as the problem
> is slightly different between YARN configs (always prefixed with `yarn`) and
> Hadoop configs (it is not guaranteed that all Hadoop configs will be prefixed
> by `hadoop`).
> 
> From the dev UX perspective, it is confusing, and it only becomes evident if
> you really pay close attention to the docs. I understand your point on the
> added complexity until Flink 3.0, but if we agree it should be made
> consistent, it has to be done at some point, right?
> 
> Regards
> Venkata krishnan
> 
> 
> On Wed, Apr 3, 2024 at 4:51 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Venkata,
> > 
> > Thank you for opening the discussion about this!
> > 
> > After taking a look at the YARN and Hadoop configurations, the
> > reason why it was implemented this way is that, in the case of YARN,
> > every YARN-specific property is prefixed with "yarn.", so to get
> > the final, YARN-side property it is enough to remove the "flink."
> > prefix.
> > 
> > In the case of Hadoop, there are properties that are not prefixed
> > with "hadoop.", e.g. "dfs.replication", so to identify and get the
> > Hadoop-side property it is necessary to duplicate the "hadoop" part
> > in the properties.
> > 
> > Taking this into consideration, I would personally say -0 to this
> > change. IMO the current behavior can be justified as giving
> > slightly different solutions to slightly different problems, which
> > are well documented. Handling both prefixes would complicate the
> > parsing logic until the old APIs can be removed, which, as it looks
> > at the moment, would only be possible in Flink 3.0, and that will
> > probably not happen in the foreseeable future; so I do not see the
> > benefit of the added complexity.
> > 
> > Regarding the FLIP, in the "YARN configuration override example"
> > part, I think you should present an example that works correctly
> > at the moment: "flink.yarn.application.classpath" ->
> > "yarn.application.classpath".
> > 
> > Best,
> > Ferenc
> > 
> > On Friday, March 29th, 2024 at 23:45, Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu> wrote:
> > 
> > > Hi Flink devs,
> > > 
> > > I would like to start a discussion on FLIP-XXX: Make Flink's Hadoop and
> > > YARN configuration probing consistent
> > > https://docs.google.com/document/d/1I2jBFI0eVkofAVCAEeajNQRfOqKGJsRfZd54h79AIYc/edit?usp=sharing.
> > > 
> > > This stems from an earlier discussion thread here
> > > https://lists.apache.org/thread/l2fh5shbf59fjgbt1h73pmmsqj038ppv.
> > 
> > > This FLIP proposes to make the configuration probing behavior between
> > > the Hadoop and YARN configurations consistent.
> > > 
> > > Regards
> > > Venkata krishnan


Re: [VOTE] FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-04-17 Thread Ferenc Csaky
+1 (non-binding)

Best,
Ferenc




On Wednesday, April 17th, 2024 at 10:26, Ahmed Hamdy  
wrote:

> 
> 
> +1 (non-binding)
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Wed, 17 Apr 2024 at 08:28, Yuepeng Pan panyuep...@apache.org wrote:
> 
> > +1(non-binding).
> > 
> > Best,
> > Yuepeng Pan
> > 
> > At 2024-04-17 14:27:27, "Ron liu" ron9@gmail.com wrote:
> > 
> > > Hi Dev,
> > > 
> > > Thank you to everyone for the feedback on FLIP-435: Introduce a New
> > > Materialized Table for Simplifying Data Pipelines[1][2].
> > > 
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours unless there is an objection or not enough votes.
> > > 
> > > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
> > 
> > > [2] https://lists.apache.org/thread/c1gnn3bvbfs8v1trlf975t327s4rsffs
> > > 
> > > Best,
> > > Ron


Re: [DISCUSS] Connector releases for Flink 1.19

2024-04-17 Thread Ferenc Csaky
Thank you Danny and Sergey for pushing this!

I can help with the HBase connector if necessary; I will comment with the
details on the relevant Jira ticket.

Best,
Ferenc




On Wednesday, April 17th, 2024 at 11:17, Danny Cranmer 
 wrote:

> 
> 
> Hello all,
> 
> I have created a parent Jira to cover the releases [1]. I have assigned AWS
> and MongoDB to myself and OpenSearch to Sergey. Please assign the
> relevant issue to yourself as you pick up the tasks.
> 
> Thanks!
> 
> [1] https://issues.apache.org/jira/browse/FLINK-35131
> 
> On Tue, Apr 16, 2024 at 2:41 PM Muhammet Orazov
> mor+fl...@morazow.com.invalid wrote:
> 
> > Thanks Sergey and Danny for clarifying, indeed it
> > requires a committer to go through the process.
> > 
> > Anyway, please let me know if I can be any help.
> > 
> > Best,
> > Muhammet
> > 
> > On 2024-04-16 11:19, Danny Cranmer wrote:
> > 
> > > Hello,
> > > 
> > > I have opened the VOTE thread for the AWS connectors release [1].
> > > 
> > > > If I'm not mistaken (please correct me if I'm wrong), this request is
> > > > not about a version update, it is about new releases for the
> > > > connectors.
> > > 
> > > Yes, correct. If there are any other code changes required then help
> > > would be appreciated.
> > > 
> > > > Are you going to create an umbrella issue for it?
> > > 
> > > We do not usually create JIRA issues for releases. That being said it
> > > sounds like a good idea to have one place to track the status of the
> > > connector releases and pre-requisite code changes.
> > > 
> > > > I would like to work on this task, thanks for initiating it!
> > > 
> > > The actual release needs to be performed by a committer. However, help
> > > getting the connectors building against Flink 1.19 and testing the RC
> > > is
> > > appreciated.
> > > 
> > > Thanks,
> > > Danny
> > > 
> > > [1] https://lists.apache.org/thread/0nw9smt23crx4gwkf6p1dd4jwvp1g5s0
> > > 
> > > On Tue, Apr 16, 2024 at 6:34 AM Sergey Nuyanzin snuyan...@gmail.com
> > > wrote:
> > > 
> > > > Thanks for volunteering Muhammet!
> > > > And thanks Danny for starting the activity.
> > > > 
> > > > If I'm not mistaken (please correct me if I'm wrong),
> > > > 
> > > > this request is not about a version update, it is about new releases
> > > > for the connectors.
> > > > BTW, for the JDBC connector, support for 1.19 and 1.20-SNAPSHOT is
> > > > already done.
> > > > 
> > > > I would volunteer for the Opensearch connector, since I'm currently
> > > > working on support for Opensearch v2,
> > > > and I think it would make sense to have a release after it is done.
> > > > 
> > > > On Tue, Apr 16, 2024 at 4:29 AM Muhammet Orazov
> > > > mor+fl...@morazow.com.invalid wrote:
> > > > 
> > > > > Hello Danny,
> > > > > 
> > > > > I would like to work on this task, thanks for initiating it!
> > > > > 
> > > > > I could update the versions on JDBC and Pulsar connectors.
> > > > > 
> > > > > Are you going to create an umbrella issue for it?
> > > > > 
> > > > > Best,
> > > > > Muhammet
> > > > > 
> > > > > On 2024-04-15 13:44, Danny Cranmer wrote:
> > > > > 
> > > > > > Hello all,
> > > > > > 
> > > > > > Flink 1.19 was released on 2024-03-18 [1] and the connectors have 
> > > > > > not
> > > > > > yet
> > > > > > caught up. I propose we start releasing the connectors with support
> > > > > > for
> > > > > > Flink 1.19 as per the connector support guidelines [2].
> > > > > > 
> > > > > > I will make a start on flink-connector-aws, then pick up others in
> > > > > > the coming days. Please respond to the thread if you are / want to
> > > > > > work on a particular connector, to avoid duplicate work.
> > > > > > 
> > > > > > Thanks,
> > > > > > Danny
> > > > > > 
> > > > > > [1]
> > 
> > https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
> > 
> > > > > > [2]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Flinkcompatibility
> > 
> > > > > > [3] https://github.com/apache/flink-connector-aws
> > > > 
> > > > --
> > > > Best regards,
> > > > Sergey


Re: [ANNOUNCE] New Apache Flink Committer - Zakelly Lan

2024-04-16 Thread Ferenc Csaky
Congratulations!

Best,
Ferenc




On Tuesday, April 16th, 2024 at 16:28, Jeyhun Karimov  
wrote:

> 
> 
> Congratulations Zakelly!
> 
> Regards,
> Jeyhun
> 
> On Tue, Apr 16, 2024 at 6:35 AM Feifan Wang zoltar9...@163.com wrote:
> 
> > Congratulations, Zakelly!
> > 
> > Best regards,
> > 
> > Feifan Wang
> > 
> > At 2024-04-15 10:50:06, "Yuan Mei" yuanmei.w...@gmail.com wrote:
> > 
> > > Hi everyone,
> > > 
> > > On behalf of the PMC, I'm happy to let you know that Zakelly Lan has
> > > become
> > > a new Flink Committer!
> > > 
> > > Zakelly has been continuously contributing to the Flink project since
> > > 2020,
> > > with a focus area on Checkpointing, State as well as frocksdb (the default
> > > on-disk state db).
> > > 
> > > He leads several FLIPs to improve checkpoints and state APIs, including
> > > File Merging for Checkpoints and configuration/API reorganizations. He is
> > > also one of the main contributors to the recent efforts of "disaggregated
> > > state management for Flink 2.0" and drives the entire discussion in the
> > > mailing thread, demonstrating outstanding technical depth and breadth of
> > > knowledge.
> > > 
> > > Beyond his technical contributions, Zakelly is passionate about helping
> > > the
> > > community in numerous ways. He spent quite some time setting up the Flink
> > > Speed Center and rebuilding the benchmark pipeline after the original one
> > > was out of lease. He helps build frocksdb and tests for the upcoming
> > > frocksdb release (bump rocksdb from 6.20.3->8.10).
> > > 
> > > Please join me in congratulating Zakelly for becoming an Apache Flink
> > > committer!
> > > 
> > > Best,
> > > Yuan (on behalf of the Flink PMC)


Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Ferenc Csaky
Thank you Mate for initiating this discussion. +1 for this idea.
Some Qs:

Can you specify the newly introduced configurations in more
detail? Currently, it is not fully clear to me what the possible
values of `kubernetes.operator.periodic.savepoint.mode` are;
is it optional, and does it have a default value?

I see that `SavepointSpec.formatType` has a default, although
`CheckpointSpec.checkpointType` does not. Are we inferring that from
the config? My point is, in general I think it would be good to
handle the two snapshot types in a similar way when it makes sense,
to minimize any kind of confusion.

Best,
Ferenc



On Tuesday, April 16th, 2024 at 11:34, Mate Czagany  wrote:

> 
> 
> Hi Everyone,
> 
> I would like to start a discussion on FLIP-446: Kubernetes Operator State
> Snapshot CRD.
> 
> This FLIP adds a new custom resource for Operator users to create and
> manage their savepoints and checkpoints. I have also developed an initial
> POC to prove that this approach is feasible, you can find the link for that
> in the FLIP.
> 
> There is a Confluence page [1] and a Google Docs page [2] as I do not have
> a Confluence account yet.
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-446%3A+Kubernetes+Operator+State+Snapshot+CRD
> [2]
> https://docs.google.com/document/d/1VdfLFaE4i6ESbCQ38CH7TKOiPQVvXeOxNV2FeSMnOTg
> 
> 
> Regards,
> Mate


[jira] [Created] (FLINK-35114) Remove old Table API implementations

2024-04-15 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-35114:


 Summary: Remove old Table API implementations
 Key: FLINK-35114
 URL: https://issues.apache.org/jira/browse/FLINK-35114
 Project: Flink
  Issue Type: Sub-task
Reporter: Ferenc Csaky


At the moment, the connector has both the old Table sink/source/catalog 
implementations and the matching Dynamic... implementations as well.

Going forward, the deprecated old implementation should be removed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee

2024-04-12 Thread Ferenc Csaky
Congratulations, Lincoln!

Best,
Ferenc




On Friday, April 12th, 2024 at 15:54, lorenzo.affe...@ververica.com.INVALID 
 wrote:

> 
> 
> Huge congrats! Well done!
> On Apr 12, 2024 at 13:56 +0200, Ron liu ron9@gmail.com, wrote:
> 
> > Congratulations, Lincoln!
> > 
> > Best,
> > Ron
> > 
> > Junrui Lee jrlee@gmail.com wrote on Fri, 12 April 2024 at 18:54:
> > 
> > > Congratulations, Lincoln!
> > > 
> > > Best,
> > > Junrui
> > > 
> > > Aleksandr Pilipenko z3d...@gmail.com wrote on Fri, 12 April 2024 at 18:29:
> > > 
> > > > > Congratulations, Lincoln!
> > > > > 
> > > > > Best Regards
> > > > > Aleksandr


Re: [ANNOUNCE] New Apache Flink PMC Member - Jing Ge

2024-04-12 Thread Ferenc Csaky
Congratulations, Jing!

Best,
Ferenc



On Friday, April 12th, 2024 at 13:54, Ron liu  wrote:

> 
> 
> Congratulations, Jing!
> 
> Best,
> Ron
> 
> > Junrui Lee jrlee@gmail.com wrote on Fri, 12 April 2024 at 18:54:
> 
> > Congratulations, Jing!
> > 
> > Best,
> > Junrui
> > 
> > > Aleksandr Pilipenko z3d...@gmail.com wrote on Fri, 12 April 2024 at 18:28:
> > 
> > > Congratulations, Jing!
> > > 
> > > Best Regards,
> > > Aleksandr


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Ferenc Csaky
+1 (non-binding)

Best,
Ferenc




On Tuesday, April 9th, 2024 at 10:32, Ahmed Hamdy  wrote:

> 
> 
> Hi Wudi,
> 
> +1 (non-binding).
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Tue, 9 Apr 2024 at 09:21, Yuepeng Pan panyuep...@apache.org wrote:
> 
> > Hi, Di.
> > 
> > Thank you for driving it !
> > 
> > +1 (non-binding).
> > 
> > Best,
> > Yuepeng Pan
> > 
> > On 2024/04/09 02:47:55 wudi wrote:
> > 
> > > Hi devs,
> > > 
> > > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > > contributing the Flink Doris Connector[2] to the Flink community.
> > > Discussion thread [3].
> > > 
> > > The vote will be open for at least 72 hours unless there is an objection
> > > or
> > > insufficient votes.
> > > 
> > > Thanks,
> > > Di.Wu
> > > 
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > [2] https://github.com/apache/doris-flink-connector
> > > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh


Re: [DISCUSS] FLIP-438: Make Flink's Hadoop and YARN configuration probing consistent

2024-04-03 Thread Ferenc Csaky
Hi Venkata,

Thank you for opening the discussion about this!

After taking a look at the YARN and Hadoop configurations, the
reason why it was implemented this way is that, in the case of
YARN, every YARN-specific property is prefixed with "yarn.", so to
get the final, YARN-side property it is enough to remove the
"flink." prefix.

In the case of Hadoop, there are properties that are not prefixed
with "hadoop.", e.g. "dfs.replication", so to identify and get the
Hadoop-side property it is necessary to duplicate the "hadoop" part
in the properties.
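
To illustrate the difference with concrete keys (mappings taken from the
behavior described above):

flink.yarn.application.classpath -> yarn.application.classpath (strip "flink.")
flink.hadoop.dfs.replication -> dfs.replication (strip "flink.hadoop.")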

Taking this into consideration, I would personally say -0 to this
change. IMO the current behavior can be justified as giving
slightly different solutions to slightly different problems, which
are well documented. Handling both prefixes would complicate the
parsing logic until the old APIs can be removed, which, as it looks
at the moment, would only be possible in Flink 3.0, and that will
probably not happen in the foreseeable future; so I do not see the
benefit of the added complexity.

Regarding the FLIP, in the "YARN configuration override example"
part, I think you should present an example that works correctly
at the moment: "flink.yarn.application.classpath" ->
"yarn.application.classpath".

Best,
Ferenc


On Friday, March 29th, 2024 at 23:45, Venkatakrishnan Sowrirajan 
 wrote:

> 
> 
> Hi Flink devs,
> 
> I would like to start a discussion on FLIP-XXX: Make Flink's Hadoop and
> YARN configuration probing consistent
> https://docs.google.com/document/d/1I2jBFI0eVkofAVCAEeajNQRfOqKGJsRfZd54h79AIYc/edit?usp=sharing.
> 
> This stems from an earlier discussion thread here
> https://lists.apache.org/thread/l2fh5shbf59fjgbt1h73pmmsqj038ppv.
> 
> 
> This FLIP proposes to make the configuration probing behavior between
> the Hadoop and YARN configurations consistent.
> 
> Regards
> Venkata krishnan


Re: [DISCUSS] FLIP-XXX: Introduce Flink SQL variables

2024-04-03 Thread Ferenc Csaky
Hi Jeyhun,

Thank you for your questions, please see my answers below.

> What is its impact on query optimization because resolving
> variables at the parsing stage might affect query optimization.

The approach I mentioned in the FLIP would not affect query
optimization, as it restricts variables to be literals, hence it
does not support calculated variables. This means that the
substitution would be a simple string replace for the variables
before the actual parse happens on the already resolved statement.
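
To make the mechanics concrete, a minimal sketch of that early, parse-independent
resolution step. The ${name} placeholder syntax, the class name, and the
map-backed VariableStore are illustrative assumptions, not the FLIP's final
design; the automatic escaping of literal values is also omitted for brevity:

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class VariableResolver {

    // Illustrative ${name} placeholder syntax; the FLIP may settle on a
    // different one (e.g. an '@' prefix, as mentioned elsewhere in the
    // discussion).
    private static final Pattern VAR = Pattern.compile("\\$\\{([A-Za-z0-9_]+)}");

    /**
     * Plain string replace over the raw statement before it reaches the
     * parser; only literal values are supported, so no expression
     * evaluation happens here.
     */
    public static String resolve(String statement, Map<String, String> variableStore) {
        Matcher matcher = VAR.matcher(statement);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String value = variableStore.get(matcher.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Undefined variable: " + matcher.group(1));
            }
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }
}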

Although this may not be the way to go: following Yanfei's
previous comments, I started to look into possible solutions for
calculated variables, which will probably change this answer. I
will report back when I have something regarding this topic.

> What is the scope of variables? I mean when and how they override
> each other and when get out of their scopes?

This is a good question, I did not mention this in the FLIP. My
thinking on this topic is that a VariableStore is tied to a SQL
session, so variables are session scoped.

Having a system-wide scope might make sense. In that case, the
system-wide variable should be shadowed by the session-wide
variable of the same name IMO, per the general rule of variable
shadowing [1]. I did not include a system-wide scope in my PoC,
but it would basically mean maintaining a dedicated system-wide
VariableStore.
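
A small sketch of what that shadowing rule could look like (all names here are
made up for illustration):

import java.util.Map;
import java.util.Optional;

public final class ScopedVariableStore {

    private final Map<String, String> systemWide; // hypothetical system-wide scope
    private final Map<String, String> session;    // session-scoped VariableStore

    public ScopedVariableStore(Map<String, String> systemWide, Map<String, String> session) {
        this.systemWide = systemWide;
        this.session = session;
    }

    /** A session-scoped definition shadows a system-wide one of the same name. */
    public Optional<String> lookup(String name) {
        String value = session.get(name);
        return Optional.ofNullable(value != null ? value : systemWide.get(name));
    }
}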

> Does the proposal support dynamic assignment of the variables or
> the value of variables should be known at query compile time?

Covered this in my answer to the first Q.

> Can we somehow benefit from/leverage Calcite's parameterization
> feature in this proposal?

I am not super familiar with Calcite's capabilities regarding this
topic, and the Calcite docs were not really helpful either. But I
might have overlooked something, so can you elaborate more / point
me towards what you mean?


Best,
Ferenc

[1] https://en.wikipedia.org/wiki/Variable_shadowing


On Monday, April 1st, 2024 at 15:24, Jeyhun Karimov  
wrote:

> 
> 
> Hi Ferenc,
> 
> Thanks for the proposal. Sounds like a good idea!
> I have a few questions on that:
> 
> - What is its impact on query optimization because resolving variables at
> the parsing stage might affect query optimization.
> 
> - What is the scope of variables? I mean when and how they override each
> other and when get out of their scopes?
> 
> - Does the proposal support dynamic assignment of the variables or the
> value of variables should be known at query compile time?
> 
> - Can we somehow benefit from/leverage Calcite's parameterization feature
> in this proposal?
> 
> Regards,
> Jeyhun
> 
> On Thu, Mar 28, 2024 at 6:21 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi, Jim, Yanfei,
> > 
> > Thanks for your comments! Let me reply in the order of the
> > messages.
> > 
> > > I'd prefer sticking to the SQL standard if possible. Would
> > > it be possible / sensible to allow for each syntax, perhaps
> > > managed by a config setting?
> > 
> > Correct me if I am wrong, but AFAIK variables are not part of
> > the ANSI SQL standard. The '@' prefix is used by some widely
> > used DB managers, e.g. MySQL.
> > 
> > Regarding having multiple resolution syntaxes, it would be
> > possible, if we agree it adds value. Personally, I do not have a
> > strong opinion on that.
> > 
> > > I'm new to Flink SQL and I'm curious if these variables can be
> > > calculated from statements or expression [1]?
> > 
> > Good point! The proposed solution would lack this functionality.
> > On our platform, we have a working solution of this that was
> > sufficient to solve the main problem we had to carry SQL between
> > environments without change.
> > 
> > At this point, variable values can only be literals, and they are
> > automatically escaped during resolution. Except if they are
> > resolved as a DDL statement property value.
> > 
> > But if the community agrees that it would be useful to have the
> > ability to use calculated variables, I would happily spend some
> > time on possible solutions that make sense in Flink.
> > 
> > WDYT?
> > 
> > Best,
> > Ferenc
> > 
> > On Thursday, March 28th, 2024 at 03:58, Yanfei Lei fredia...@gmail.com
> > wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > Thanks for the proposal, using SQL variables to exclude
> > > environment-specific configuration from code sounds like a good idea.
> > > 
> > > I'm new to Flink SQL and I'm curious if these variables can be
> > > calculated from statements or expressions [1]? In the FLIP, it seems
> > > that the values are in the form of StringLiteral.

Re: [DISCUSS] FLIP-XXX: Introduce Flink SQL variables

2024-03-28 Thread Ferenc Csaky
Hi, Jim, Yanfei,

Thanks for your comments! Let me reply in the order of the
messages.

> I'd prefer sticking to the SQL standard if possible.  Would
> it be possible / sensible to allow for each syntax, perhaps
> managed by a config setting?

Correct me if I am wrong, but AFAIK variables are not part of
the ANSI SQL standard. The '@' prefix is used by some widely
used DB managers, e.g. MySQL.

Regarding having multiple resolution syntaxes, it would be
possible, if we agree it adds value. Personally, I do not have a
strong opinion on that.


> I'm new to Flink SQL and I'm curious if these variables can be
> calculated from statements or expression [1]?

Good point! The proposed solution would lack this functionality.
On our platform, we have a working solution for this that was
sufficient to solve our main problem: carrying SQL between
environments without changes.

At this point, variable values can only be literals, and they are
automatically escaped during resolution. Except if they are
resolved as a DDL statement property value.

But if the community agrees that it would be useful to have the
ability to use calculated variables, I would happily spend some
time on possible solutions that make sense in Flink.

WDYT?

Best,
Ferenc



On Thursday, March 28th, 2024 at 03:58, Yanfei Lei  wrote:

> 
> 
> Hi Ferenc,
> 
> Thanks for the proposal, using SQL variables to exclude
> environment-specific configuration from code sounds like a good idea.
> 
> I'm new to Flink SQL and I'm curious if these variables can be
> calculated from statements or expressions [1]? In the FLIP, it seems
> that the values are in the form of StringLiteral.
> 
> 
> [1] 
> https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-aux-set-variable.html
> 
> Jim Hughes jhug...@confluent.io.invalid 于2024年3月28日周四 04:54写道:
> 
> > Hi Ferenc,
> > 
> > Looks like a good idea.
> > 
> > I'd prefer sticking to the SQL standard if possible. Would it be possible
> > / sensible to allow for each syntax, perhaps managed by a config setting?
> > 
> > Cheers,
> > 
> > Jim
> > 
> > On Tue, Mar 26, 2024 at 6:59 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > wrote:
> > 
> > > Hello devs,
> > > 
> > > I would like to start a discussion about FLIP-XXX: Introduce Flink SQL
> > > variables [1].
> > > 
> > > The main motivation behind this change is to be able to abstract Flink SQL
> > > from
> > > environment-specific configuration and provide a way to carry jobs between
> > > environments (e.g. dev-stage-prod) without the need to make changes in the
> > > code.
> > > It can also be a way to decouple sensitive information from the job code,
> > > or help
> > > with redundant literals.
> > > 
> > > The main decision regarding the proposed solution is to handle the
> > > variable resolution
> > > as early as possible on the given string statement, so the whole operation
> > > is an easy and
> > > lightweight string replace. But this approach introduces some limitations
> > > as well:
> > > 
> > > - The executed SQL will always be the unresolved, raw string, so in case
> > > of secrets
> > > a DESC operation would show them.
> > > - Changing the value of a variable can break code that uses that variable.
> > > 
> > > For more details, please check the FLIP [1]. There is also a stale Jira
> > > about this [2].
> > > 
> > > Looking forward to any comments and opinions!
> > > 
> > > Thanks,
> > > Ferenc
> > > 
> > > [1]
> > > https://docs.google.com/document/d/1-eUz-PBCdqNggG_irDT0X7fdL61ysuHOaWnrkZHb5Hc/edit?usp=sharing
> > > [2] https://issues.apache.org/jira/browse/FLINK-17377
> 
> 
> 
> 
> --
> Best,
> Yanfei


[DISCUSS] FLIP-XXX: Introduce Flink SQL variables

2024-03-26 Thread Ferenc Csaky
Hello devs,

I would like to start a discussion about FLIP-XXX: Introduce Flink SQL 
variables [1].

The main motivation behind this change is to be able to abstract Flink SQL from
environment-specific configuration and provide a way to carry jobs between
environments (e.g. dev-stage-prod) without the need to make changes in the code.
It can also be a way to decouple sensitive information from the job code, or 
help
with redundant literals.

The main decision regarding the proposed solution is to handle the variable 
resolution
as early as possible on the given string statement, so the whole operation is 
an easy and
lightweight string replace. But this approach introduces some limitations as 
well:

- The executed SQL will always be the unresolved, raw string, so in case of 
secrets
a DESC operation would show them.
- Changing the value of a variable can break code that uses that variable.

For more details, please check the FLIP [1]. There is also a stale Jira about 
this [2].

Looking forward to any comments and opinions!

Thanks,
Ferenc

[1] 
https://docs.google.com/document/d/1-eUz-PBCdqNggG_irDT0X7fdL61ysuHOaWnrkZHb5Hc/edit?usp=sharing
[2] https://issues.apache.org/jira/browse/FLINK-17377

Re: [DISCUSS] Flink Website Menu Adjustment

2024-03-25 Thread Ferenc Csaky
The suggested changes make sense, +1 for the proposed menus and order.

Best,
Ferenc




On Monday, March 25th, 2024 at 14:50, Gyula Fóra  wrote:

> 
> 
> +1 for the proposal
> 
> Gyula
> 
> On Mon, Mar 25, 2024 at 12:49 PM Leonard Xu xbjt...@gmail.com wrote:
> 
> > Thanks Zhongqiang for starting this discussion, updating documentation
> > menus according to sub-projects' activities makes sense to me.
> > 
> > +1 for the proposed menus:
> > 
> > > After:
> > > 
> > > With Flink
> > > With Flink Kubernetes Operator
> > > With Flink CDC
> > > With Flink ML
> > > With Flink Stateful Functions
> > > Training Course
> > 
> > Best,
> > Leonard
> > 
> > > On 25 March 2024 at 3:48 PM, gongzhongqiang gongzhongqi...@apache.org wrote:
> > > 
> > > Hi everyone,
> > > 
> > > I'd like to start a discussion on adjusting the Flink website [1] menu to
> > > improve accuracy and usability. While migrating the Flink CDC
> > > documentation to the website, I found outdated links, and we need to
> > > review and update the menus to surface the most relevant information for
> > > our users.
> > > 
> > > Proposal:
> > > 
> > > - Remove Paimon [2] from the "Getting Started" and "Documentation" menus:
> > > Paimon [2] is now an independent top-level project of the ASF. CC: Jingsong Lee
> > > 
> > > - Sort the projects in the subdirectory by the activity of the projects.
> > > Here I list the number of releases for each project in the past year.
> > > 
> > > Flink Kubernetes Operator : 7
> > > Flink CDC : 5
> > > Flink ML : 2
> > > Flink Stateful Functions : 1
> > > 
> > > Expected Outcome :
> > > 
> > > - Menu "Getting Started"
> > > 
> > > Before:
> > > 
> > > With Flink
> > > 
> > > With Flink Stateful Functions
> > > 
> > > With Flink ML
> > > 
> > > With Flink Kubernetes Operator
> > > 
> > > With Paimon(incubating) (formerly Flink Table Store)
> > > 
> > > With Flink CDC
> > > 
> > > Training Course
> > > 
> > > After:
> > > 
> > > With Flink
> > > With Flink Kubernetes Operator
> > > 
> > > With Flink CDC
> > > 
> > > With Flink ML
> > > 
> > > With Flink Stateful Functions
> > > 
> > > Training Course
> > > 
> > > - Menu "Documentation" will same with "Getting Started"
> > > 
> > > I look forward to hearing your thoughts and suggestions on this proposal.
> > > 
> > > [1] https://flink.apache.org/
> > > [2] https://github.com/apache/incubator-paimon
> > > [3] https://github.com/apache/flink-statefun
> > > 
> > > Best regards,
> > > 
> > > Zhongqiang Gong


[jira] [Created] (FLINK-34931) Update Kudu connector DataStream Source/Sink implementation

2024-03-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34931:


 Summary: Update Kudu connector DataStream Source/Sink 
implementation
 Key: FLINK-34931
 URL: https://issues.apache.org/jira/browse/FLINK-34931
 Project: Flink
  Issue Type: Sub-task
Reporter: Ferenc Csaky


Update the DataSource API classes to use the current interfaces.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34930) Move existing Kudu connector code from Bahir repo to dedicated repo

2024-03-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34930:


 Summary: Move existing Kudu connector code from Bahir repo to 
dedicated repo
 Key: FLINK-34930
 URL: https://issues.apache.org/jira/browse/FLINK-34930
 Project: Flink
  Issue Type: Sub-task
Reporter: Ferenc Csaky






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34929) Create "flink-connector-kudu" repository

2024-03-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34929:


 Summary: Create "flink-connector-kudu" repository
 Key: FLINK-34929
 URL: https://issues.apache.org/jira/browse/FLINK-34929
 Project: Flink
  Issue Type: Sub-task
Reporter: Ferenc Csaky


We should create a "flink-connector-kudu" repositry under the "apache" GitHub 
organization.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34928) FLIP-439: Externalize Kudu Connector from Bahir

2024-03-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34928:


 Summary: FLIP-439: Externalize Kudu Connector from Bahir
 Key: FLINK-34928
 URL: https://issues.apache.org/jira/browse/FLINK-34928
 Project: Flink
  Issue Type: Improvement
Reporter: Ferenc Csaky


Umbrella issue for: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT][VOTE] FLIP-439: Externalize Kudu Connector from Bahir

2024-03-25 Thread Ferenc Csaky
Hi everyone,

I'm happy to announce that FLIP-439: Externalize Kudu Connector from Bahir [1]
has been accepted with 12 approving votes, 6 of which are binding [2]:

- Mate Czagany (non-binding)
- Gyula Fóra (binding)
- Gabor Somogyi (binding)
- Mátyás Őrhidi (binding)
- Hang Ruan (non-binding)
- Samrat Deb (non-binding)
- Gong Zhongqiang (non-binding)
- Martijn Visser (binding)
- Leonard Xu Jin (binding)
- Marton Balassi (binding)
- Jeyhun Karimov (non-binding)
- Yuepeng Pan (non-binding)

There are no disapproving votes. Thanks to everyone who participated in the
discussion and voting.

Best,
Ferenc

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir
[2] https://lists.apache.org/thread/cjv0lq7x3m98dbhjk4p2wjps5rk2l9kj

[VOTE] FLIP-439: Externalize Kudu Connector from Bahir

2024-03-20 Thread Ferenc Csaky
Hello devs,

I would like to start a vote about FLIP-439 [1]. The FLIP is about
externalizing the Kudu connector from the recently retired Apache Bahir
project [2] to keep it maintainable and up to date. Discussion thread [3].

The vote will be open for at least 72 hours (until 2024 March 23 14:03 UTC) 
unless there
are any objections or insufficient votes.

Thanks,
Ferenc

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir
[2] https://attic.apache.org/projects/bahir.html
[3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz

Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations

2024-03-20 Thread Ferenc Csaky
+1 (non-binding), thanks for driving this!

Best,
Ferenc


On Wednesday, March 20th, 2024 at 10:57, Yang Wang  
wrote:

> 
> 
> +1 (binding) since ZK HA is still widely used.
> 
> 
> Best,
> Yang
> 
> On Thu, Mar 14, 2024 at 6:27 PM Matthias Pohl
> matthias.p...@aiven.io.invalid wrote:
> 
> > Nothing to add from my side. Thanks, Alex.
> > 
> > +1 (binding)
> > 
> > On Thu, Mar 7, 2024 at 4:09 PM Alex Nitavsky alexnitav...@gmail.com
> > wrote:
> > 
> > > Hi everyone,
> > > 
> > > I'd like to start a vote on FLIP-402 [1]. It introduces new configuration
> > > options for Apache Flink's ZooKeeper integration for high availability by
> > > reflecting existing Apache Curator configuration options. It has been
> > > discussed in this thread [2].
> > > 
> > > I would like to start a vote. The vote will be open for at least 72
> > > hours
> > > (until March 10th 18:00 GMT) unless there is an objection or
> > > insufficient votes.
> > > 
> > > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations
> > 
> > > [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z
> > > 
> > > Thanks
> > > Alex


Re: [VOTE] FLIP-436: Introduce Catalog-related Syntax

2024-03-19 Thread Ferenc Csaky
+1 (non-binding).

Best,
Ferenc




On Tuesday, March 19th, 2024 at 12:39, Jark Wu  wrote:

> 
> 
> +1 (binding)
> 
> Best,
> Jark
> 
> On Tue, 19 Mar 2024 at 19:05, Yuepeng Pan panyuep...@apache.org wrote:
> 
> > Hi, Yubin
> > 
> > Thanks for driving it !
> > 
> > +1 non-binding.
> > 
> > Best,
> > Yuepeng Pan.
> > 
> > At 2024-03-19 17:56:42, "Yubin Li" lyb5...@gmail.com wrote:
> > 
> > > Hi everyone,
> > > 
> > > Thanks for all the feedback, I'd like to start a vote on the FLIP-436:
> > > Introduce Catalog-related Syntax [1]. The discussion thread is here
> > > [2].
> > > 
> > > The vote will be open for at least 72 hours unless there is an
> > > objection or insufficient votes.
> > > 
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> > > [2] https://lists.apache.org/thread/10k1bjb4sngyjwhmfqfky28lyoo7sv0z
> > > 
> > > Best regards,
> > > Yubin


Re: [DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-19 Thread Ferenc Csaky
Hi,

since there have been no new comments for a while, if there are no further
comments for another day, I will start a vote thread.

Thanks,
Ferenc


On Thursday, March 14th, 2024 at 11:20, Ferenc Csaky 
 wrote:

> 
> 
> Hi,
> 
> Gentle ping to see if there are any other concerns or things that seem
> missing from the FLIP.
> 
> Best,
> Ferenc
> 
> 
> 
> 
> On Monday, March 11th, 2024 at 11:11, Ferenc Csaky ferenc.cs...@pm.me.INVALID 
> wrote:
> 
> > Hi Jing,
> > 
> > Thank you for your comments! Updated the FLIP with reasoning on the 
> > proposed release versions and included them in the headline "Release" field.
> > 
> > Best,
> > Ferenc
> > 
> > On Sunday, March 10th, 2024 at 16:59, Jing Ge j...@ververica.com.INVALID 
> > wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > Thanks for the proposal! +1 for it!
> > > 
> > > Similar to what Leonard mentioned. I would suggest:
> > > 1. Use the "release" to define the release version of the Kudu connector
> > > itself.
> > > 2. Optionally, add one more row underneath to describe which Flink 
> > > versions
> > > this release will be compatible with, e.g. 1.17, 1.18. I think it makes
> > > sense to support at least two last Flink releases. An example could be
> > > found at [1]
> > > 
> > > Best regards,
> > > Jing
> > > 
> > > [1] https://lists.apache.org/thread/jcjfy3fgpg5cdnb9noslq2c77h0gtcwp
> > > 
> > > On Sun, Mar 10, 2024 at 3:46 PM Yanquan Lv decq12y...@gmail.com wrote:
> > > 
> > > > Hi Ferenc, +1 for this FLIP.
> > > > 
> > > > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote on Sat, 9 March 2024 at 01:49:
> > > > 
> > > > > Thank you Jeyhun, Leonard, and Hang for your comments! Let me
> > > > > address them from earliest to latest.
> > > > > 
> > > > > > How do you plan the review process in this case (e.g. incremental
> > > > > > over existing codebase or cumulative all at once) ?
> > > > > 
> > > > > I think incremental would be less time consuming and complex for
> > > > > reviewers, so I am leaning towards that direction. I would
> > > > > imagine multiple subtasks for migrating the existing code and
> > > > > updating the deprecated interfaces, so those should be separate PRs,
> > > > > and the release can be initiated when everything is merged.
> > > > > 
> > > > > > (1) About the release version, could you specify the Kudu connector
> > > > > > version instead of Flink version 1.18, as the external connector
> > > > > > version is different from Flink's?
> > > > > > (2) About the connector config options, could you enumerate these
> > > > > > options so that we can review whether they're reasonable or not?
> > > > > 
> > > > > I added these to the FLIP and copied the current config options as
> > > > > is, PTAL.
> > > > > 
> > > > > > (3) Metrics are also a key part of a connector, could you add the
> > > > > > supported connector metrics to the public interface as well?
> > > > > 
> > > > > The current Bahir connector code does not include any metrics, and I
> > > > > did not plan to include them in the scope of this FLIP.
> > > > > 
> > > > > > I think that how to state that this code originally lived in Bahir
> > > > > > should be in the FLIP.
> > > > > 
> > > > > I might be missing your point, but the FLIP contains this: "Migrating the
> > > > > current code keeping the history and noting it explicitly it was 
> > > > > forked
> > > > > from the Bahir repository [2]." Pls. share if you meant something 
> > > > > else.
> > > > > 
> > > > > Best,
> > > > > Ferenc
> > > > > 
> > > > > On Friday, March 8th, 2024 at 10:42, Hang Ruan ruanhang1...@gmail.com
> > > > > wrote:
> > > > > 
> > > > > > Hi, Ferenc.
> > > > > > 
> > > > > > Thanks for the FLIP discussion. +1 for the proposal.
> > > > > > I think that how to state this code originally lived in Bahir may be
> > > > > > in the FLIP.

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Ferenc Csaky
 +1 (non-binding)

- Verified checksum and signature
- Verified no binary in src
- Built from src
- Reviewed release note PR
- Reviewed web PR
- Tested a simple datagen query and insert to blackhole sink via SQL Gateway

Best,
Ferenc




On Thursday, March 14th, 2024 at 12:14, Jane Chan  wrote:

> 
> 
> Hi Lincoln,
> 
> Thank you for the prompt response and the effort to provide clarity on this
> matter.
> 
> Best,
> Jane
> 
> On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com wrote:
> 
> > Hi Jane,
> > 
> > Thank you for raising this question. I saw the discussion in the Jira
> > (including Matthias' point)
> > and sought advice from several PMCs (including the previous RMs), the
> > majority of people
> > are in favor of merging the bugfix into the release branch even during the
> > release candidate
> > (RC) voting period, so we should accept all bugfixes (unless there is a
> > specific community
> > rule preventing it).
> > 
> > Thanks again for contributing to the community!
> > 
> > Best,
> > Lincoln Lee
> > 
> > > Matthias Pohl matthias.p...@aiven.io.invalid wrote on Thu, Mar 14, 2024 at 17:50:
> > 
> > > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > > identify
> > > a concurrency issue in the JobMaster shutdown logic which seems to have been
> > > in the code for quite some time. I created a PR fixing the issue, hoping that
> > > the test instability is resolved with it.
> > > 
> > > The concurrency issue doesn't really explain why it only started to
> > > appear
> > > recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
> > > hint in the git history indicating that it's caused by some newly
> > > introduced change. That is why I wouldn't make FLINK-34227 a reason to
> > > cancel rc2. Instead, the fix can be provided in subsequent patch
> > > releases.
> > > 
> > > Matthias
> > > 
> > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > 
> > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com wrote:
> > > 
> > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > 
> > > > I'm seeking guidance on whether merging the bugfix[1][2] at this stage
> > > > is
> > > > appropriate. I want to ensure that the actions align with the current
> > > > release process and do not disrupt the ongoing preparations.
> > > > 
> > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > [2] https://github.com/apache/flink/pull/24492
> > > > 
> > > > Best,
> > > > Jane
> > > > 
> > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com wrote:
> > > > 
> > > > > +1 (non-binding)
> > > > > 
> > > > > * Verified the signature and checksum.
> > > > > * Reviewed the release note PR
> > > > > * Reviewed the web announcement PR
> > > > > * Started a standalone cluster to submit the state machine example, which
> > > > > works well.
> > > > > * Checked the pre-built jars are generated via JDK8
> > > > > * Verified the process profiler works well after setting
> > > > > rest.profiling.enabled: true
> > > > > 
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > 
> > > > > From: Qingsheng Ren re...@apache.org
> > > > > Sent: Wednesday, March 13, 2024 12:45
> > > > > To: dev@flink.apache.org dev@flink.apache.org
> > > > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > > > > 
> > > > > +1 (binding)
> > > > > 
> > > > > - Verified signature and checksum
> > > > > - Verified no binary in source
> > > > > - Built from source
> > > > > - Tested reading and writing Kafka with SQL client and Kafka
> > > > > connector
> > > > > 3.1.0
> > > > > - Verified source code tag
> > > > > - Reviewed release note
> > > > > - Reviewed web PR
> > > > > 
> > > > > Thanks to all release managers and contributors for the awesome work!
> > > > > 
> > > > > Best,
> > > > > Qingsheng
> > > > > 
> > > > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > > > > matthias.p...@aiven.io.invalid wrote:
> > > > > 
> > > > > > I want to share an update on FLINK-34227 [1]: It's still not clear
> > > > > > what's
> > > > > > causing the test instability. So far, we agreed in today's release
> > > > > > sync
> > > > > > [2]
> > > > > > that it's not considered a blocker because it is observed in 1.18
> > > > > > nightly
> > > > > > builds and it only appears in the GitHub Actions workflow. But I
> > > > > > still
> > > > > > have
> > > > > > a bit of a concern that this is something that was introduced in
> > > > > > 1.19
> > > > > > and
> > > > > > backported to 1.18 after the 1.18.1 release (because the test
> > > > > > instability
> > > > > > started to appear more regularly in March; with one occurrence in
> > > > > > January).
> > > > > > Additionally, I have no reason to believe, yet, that the
> > > > > > instability
> > > > > > is
> > > > > > caused by some GHA-related infrastructure issue.
> > > > > > 
> > > > > > So, if someone else has some capacity to help looking into it; that
> > > > 

Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Ferenc Csaky
Hi Yubin,

Thank you for initiating this discussion! +1 for the proposal.

I also think it makes sense to group the missing catalog related
SQL syntaxes under this FLIP.

Looking forward to these features!

Best,
Ferenc




On Thursday, March 14th, 2024 at 08:31, Jane Chan  wrote:

> 
> 
> Hi Yubin,
> 
> Thanks for leading the discussion. I'm +1 for the FLIP.
> 
> As Jark said, it's a good opportunity to enhance the syntax for Catalog
> from a more comprehensive perspective. So, I suggest expanding the scope of
> this FLIP by focusing on the mechanism instead of one use case to enhance
> the overall functionality. WDYT?
> 
> Best,
> Jane
> 
> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com wrote:
> 
> > Hi, Yubin.
> > 
> > Thanks for the FLIP. +1 for it.
> > 
> > Best,
> > Hang
> > 
> > Yubin Li lyb5...@gmail.com wrote on Thu, Mar 14, 2024 at 10:15:
> > 
> > > Hi Jingsong, Feng, and Jeyhun
> > > 
> > > Thanks for your support and feedback!
> > > 
> > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > CatalogManager instead of directly exposing CatalogStore?
> > > 
> > > Good point. Besides the audit tracking issue, the proposed feature
> > > only requires the `getCatalogDescriptor()` function. Exposing components
> > > with excessive functionality would bring unnecessary risks, so I have made
> > > modifications in the FLIP doc [1]. Thanks Feng :)
> > > 
> > > > Showing the SQL parser implementation in the FLIP for the SQL syntax
> > > > might be a bit confusing. Also, the formal definition is missing for
> > > > this SQL clause.
> > > 
> > > Thanks Jeyhun for pointing it out :) I have updated the doc [1].
> > > 
> > > [1]
> > 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> > 
> > > Best,
> > > Yubin
> > > 
> > > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
> > > wrote:
> > > 
> > > > Hi Yubin,
> > > > 
> > > > Thanks for the proposal. +1 for it.
> > > > I have one comment:
> > > > 
> > > > I would like to see the SQL syntax for the proposed statement. Showing
> > > > the
> > > > SQL parser implementation in the FLIP
> > > > for the SQL syntax might be a bit confusing. Also, the formal
> > > > definition
> > > > is
> > > > missing for this SQL clause.
> > > > Maybe something like [1] might be useful. WDYT?
> > > > 
> > > > Regards,
> > > > Jeyhun
> > > > 
> > > > [1]
> > 
> > https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> > 
> > > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
> > > > wrote:
> > > > 
> > > > > Hi Yubin
> > > > > 
> > > > > Thank you for initiating this FLIP.
> > > > > 
> > > > > I have just one minor question:
> > > > > 
> > > > > I noticed that we added a new function `getCatalogStore` to expose
> > > > > CatalogStore, and it seems fine.
> > > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > > CatalogManager instead of directly exposing CatalogStore?
> > > > > By only providing the `getCatalogDescriptor()` interface, it may be
> > > > > easier
> > > > > for us to implement audit tracking in CatalogManager in the future.
> > > > > WDYT ?
> > > > > Although we have only collected some modified events at the
> > > > > moment.[1]
> > > > > 
> > > > > [1].
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > 
> > > > > Best,
> > > > > Feng
> > > > > 
> > > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li jingsongl...@gmail.com
> > > > > wrote:
> > > > > 
> > > > > > +1 for this.
> > > > > > 
> > > > > > We are missing a series of catalog related syntaxes.
> > > > > > Especially after the introduction of catalog store. [1]
> > > > > > 
> > > > > > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > 
> > > > > > Best,
> > > > > > Jingsong
> > > > > > 
> > > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li lyb5...@gmail.com
> > > > > > wrote:
> > > > > > 
> > > > > > > Hi devs,
> > > > > > > 
> > > > > > > I'd like to start a discussion about FLIP-436: Introduce "SHOW
> > > > > > > CREATE
> > > > > > > CATALOG" Syntax [1].
> > > > > > > 
> > > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
> > > > > > > support
> > > > > > > for
> > > > > > > users to easily
> > > > > > > reuse created tables. However, despite the increasing importance
> > > > > > > of the
> > > > > > > `Catalog` in user's
> > > > > > > business, there is no similar statement for users to use.
> > > > > > > 
> > > > > > > According to the online discussion in FLINK-24939 [2] with Jark
> > > > > > > Wu
> > > > > > > and
> > > > > > > Feng
> > > > > > > Jin, since `CatalogStore`
> > > > > > > has been introduced in FLIP-295 [3], we could use this component
> > > > > > > to
> > > > > > > implement such a long-awaited feature.
> > > > > > 
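As a sketch of how the proposed statement could be exercised from the Table API
(the SHOW CREATE CATALOG syntax below is the FLIP-436 proposal under discussion,
not yet released; the catalog name and options are made-up examples):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class ShowCreateCatalogDemo {
        public static void main(String[] args) {
            TableEnvironment tEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());
            // Register a catalog; the generic_in_memory type ships with Flink.
            tEnv.executeSql(
                    "CREATE CATALOG my_catalog WITH ('type' = 'generic_in_memory')");
            // Proposed FLIP-436 syntax: prints the DDL that recreates the catalog.
            tEnv.executeSql("SHOW CREATE CATALOG my_catalog").print();
        }
    }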

Re: [DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-14 Thread Ferenc Csaky
Hi,

Gentle ping to see if there are any other concerns or things that seem missing 
from the FLIP.

Best,
Ferenc




On Monday, March 11th, 2024 at 11:11, Ferenc Csaky  
wrote:

> 
> 
> Hi Jing,
> 
> Thank you for your comments! Updated the FLIP with reasoning on the proposed 
> release versions and included them in the headline "Release" field.
> 
> Best,
> Ferenc
> 
> 
> 
> 
> On Sunday, March 10th, 2024 at 16:59, Jing Ge j...@ververica.com.INVALID 
> wrote:
> 
> > Hi Ferenc,
> > 
> > Thanks for the proposal! +1 for it!
> > 
> > Similar to what Leonard mentioned. I would suggest:
> > 1. Use the "release" to define the release version of the Kudu connector
> > itself.
> > 2. Optionally, add one more row underneath to describe which Flink versions
> > this release will be compatible with, e.g. 1.17, 1.18. I think it makes
> > sense to support at least two last Flink releases. An example could be
> > found at [1]
> > 
> > Best regards,
> > Jing
> > 
> > [1] https://lists.apache.org/thread/jcjfy3fgpg5cdnb9noslq2c77h0gtcwp
> > 
> > On Sun, Mar 10, 2024 at 3:46 PM Yanquan Lv decq12y...@gmail.com wrote:
> > 
> > > Hi Ferenc, +1 for this FLIP.
> > > 
> > > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote on Sat, Mar 9, 2024 at 01:49:
> > > 
> > > > Thank you Jeyhun, Leonard, and Hang for your comments! Let me
> > > > address them from earliest to latest.
> > > > 
> > > > > How do you plan the review process in this case (e.g. incremental
> > > > > over existing codebase or cumulative all at once) ?
> > > > 
> > > > I think incremental would be less time-consuming and complex for
> > > > reviewers, so I would lean towards that direction. I would
> > > > imagine multiple subtasks for migrating the existing code, and
> > > > updating the deprecated interfaces, so those should be separate PRs and
> > > > the release can be initiated when everything is merged.
> > > > 
> > > > > (1) About the release version, could you specify kudu connector 
> > > > > version
> > > > > instead of flink version 1.18 as external connector version is 
> > > > > different
> > > > > with flink?
> > > > > (2) About the connector config options, could you enumerate these
> > > > > options so that we can review they’re reasonable or not?
> > > > 
> > > > I added these to the FLIP, copied the current config options as is,
> > > > PTAL.
> > > > 
> > > > > (3) Metrics is also key part of connector, could you add the supported
> > > > > connector metrics to public interface as well?
> > > > 
> > > > The current Bahir connector code does not include any metrics and I did
> > > > not plan to include them into the scope of this FLIP.
> > > > 
> > > > > I think that how to state this code originally lived in Bahir may be 
> > > > > in
> > > > > the
> > > > > FLIP.
> > > > 
> > > > I might miss your point, but the FLIP contains this: "Migrating the
> > > > current code, keeping the history and noting explicitly that it was forked
> > > > from the Bahir repository [2]." Pls. share if you meant something else.
> > > > 
> > > > Best,
> > > > Ferenc
> > > > 
> > > > On Friday, March 8th, 2024 at 10:42, Hang Ruan ruanhang1...@gmail.com
> > > > wrote:
> > > > 
> > > > > Hi, Ferenc.
> > > > > 
> > > > > Thanks for the FLIP discussion. +1 for the proposal.
> > > > > I think that how to state this code originally lived in Bahir may be 
> > > > > in
> > > > > the
> > > > > FLIP.
> > > > > 
> > > > > Best,
> > > > > Hang
> > > > > 
> > > > > Leonard Xu xbjt...@gmail.com wrote on Thu, Mar 7, 2024 at 14:14:
> > > > > 
> > > > > > Thanks Ferenc for kicking off this discussion, I left some comments
> > > > > > here:
> > > > > > 
> > > > > > (1) About the release version, could you specify kudu connector
> > > > > > version
> > > > > > instead of flink version 1.18 as external connector version is
> > > > > > different
> > > > > > with flink?

Re: [DISCUSS] FLIP-399: Flink Connector Doris

2024-03-11 Thread Ferenc Csaky
Hi,

Thanks for driving this, +1 for the FLIP.

Best,
Ferenc




On Monday, March 11th, 2024 at 15:17, Ahmed Hamdy  wrote:

> 
> 
> Hello,
> Thanks for the proposal, +1 for the FLIP.
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Mon, 11 Mar 2024 at 15:12, wudi 676366...@qq.com.invalid wrote:
> 
> > Hi, Leonard
> > Thank you for your suggestion.
> > I referred to other Connectors[1], modified the naming and types of
> > relevant parameters[2], and also updated the FLIP.
> > 
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> > [2]
> > https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
> > 
> > Brs,
> > di.wu
> > 
> > > On Mar 7, 2024, at 14:33, Leonard Xu xbjt...@gmail.com wrote:
> > > 
> > > Thanks wudi for the updating, the FLIP generally looks good to me, I
> > > only left two minor suggestions:
> > > 
> > > (1) The suffix `.s` in config option doris.request.query.timeout.s looks
> > > strange to me, could we change all time interval related option value type
> > > to Duration ?
> > > 
> > > (2) Could you check and improve all config options like
> > > `doris.exec.mem.limit` to make them to follow flink config option naming
> > > and value type?
> > > 
> > > Best,
> > > Leonard
> > > 
> > > > > On Mar 6, 2024, at 06:12, Jing Ge j...@ververica.com.INVALID wrote:
> > > > > 
> > > > > Hi Di,
> > > > > 
> > > > > Thanks for your proposal. +1 for the contribution. I'd like to know
> > > > > your
> > > > > thoughts about the following questions:
> > > > > 
> > > > > 1. According to your clarification of the exactly-once, thanks for it
> > > > > BTW,
> > > > > no PreCommitTopology is required. Does it make sense to let
> > > > > DorisSink[1]
> > > > > implement SupportsCommitter, since the TwoPhaseCommittingSink is
> > > > > deprecated[2] before turning the Doris connector into a Flink
> > > > > connector?
> > > > > 2. OLAP engines are commonly used as the tail/downstream of a data
> > > > > pipeline
> > > > > to support further e.g. ad-hoc query or cube with feasible
> > > > > pre-aggregation.
> > > > > Just out of curiosity, would you like to share some real use cases 
> > > > > that
> > > > > will use OLAP engines as the source of a streaming data pipeline? Or 
> > > > > it
> > > > > will only be used as the source for the batch?
> > > > > 3. The E2E test only covered sink[3], if I am not mistaken. Would you
> > > > > like
> > > > > to test the source in E2E too?
> > > > > 
> > > > > [1]
> > 
> > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/DorisSink.java#L55
> > 
> > > > > [2]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> > 
> > > > > [3]
> > 
> > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/test/java/org/apache/doris/flink/tools/cdc/MySQLDorisE2ECase.java#L96
> > 
> > > > > Best regards,
> > > > > Jing
> > > > > 
> > > > > On Tue, Mar 5, 2024 at 11:18 AM wudi 676366...@qq.com.invalid wrote:
> > > > > 
> > > > > > Hi, Jeyhun Karimov.
> > > > > > Thanks for your question.
> > > > > > 
> > > > > > - How to ensure Exactly-Once?
> > > > > > 1. When the Checkpoint Barrier arrives, DorisSink will trigger the
> > > > > > precommit api of StreamLoad to complete the persistence of data in
> > > > > > Doris
> > > > > > (the data will not be visible at this time), and will also pass this
> > > > > > TxnID
> > > > > > to the Committer.
> > > > > > 2. When this Checkpoint of the entire Job is completed, the 
> > > > > > Committer
> > > > > > will
> > > > > > call the commit api of StreamLoad and commit TxnID to complete the
> > > > > > visibility of the transaction.
> > > > > > 3. When the task is restarted, the Txn with successful precommit and
> > > > > > failed commit will be aborted based on the label-prefix, and Doris'
> > > > > > abort
> > > > > > API will be called. (At the same time, Doris will also abort
> > > > > > transactions
> > > > > > that have not been committed for a long time)
> > > > > > 
> > > > > > ps: At the same time, this part of the content has been updated in
> > > > > > FLIP
> > > > > > 
> > > > > > - Because the default table model in Doris is Duplicate (
> > > > > > https://doris.apache.org/docs/data-table/data-model/), which does 
> > > > > > not
> > > > > > have a primary key, batch writing may cause data duplication, but the
> > > > > > UNIQ
> > > > > > model has a primary key, which ensures the idempotence of writing,
> > > > > > thus
> > > > > > achieving Exactly-Once
> > > > > > 
> > > > > > Brs,
> > > > > > di.wu
> > > > > > 
> > > > > > > On Mar 2, 2024, at 17:50, Jeyhun Karimov je.kari...@gmail.com wrote:
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Thanks for the proposal. +1 for the FLIP.
> > 
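To make the precommit/commit flow described by di.wu above concrete, a minimal
compilable sketch (class and method names are illustrative stand-ins, not the
actual doris-flink-connector API):

    import java.util.ArrayList;
    import java.util.List;

    /** Illustrative 2PC flow: precommit on checkpoint, commit on completion. */
    class TwoPhaseFlowSketch {
        private final List<Long> pendingTxns = new ArrayList<>();
        private long nextTxnId = 1;

        /** Checkpoint barrier arrives: data is persisted via StreamLoad precommit. */
        Long snapshotState() {
            Long txnId = nextTxnId++;   // stand-in for StreamLoad's precommit call
            pendingTxns.add(txnId);     // data persisted but not yet visible
            return txnId;               // TxnID handed to the committer
        }

        /** Checkpoint completed: commit makes the transaction visible. */
        void notifyCheckpointComplete(Long txnId) {
            pendingTxns.remove(txnId);  // stand-in for StreamLoad's commit call
        }

        /** Restart: abort precommitted-but-uncommitted txns found by label prefix. */
        void recover(String labelPrefix) {
            pendingTxns.clear();        // stand-in for Doris' abort API
        }
    }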

Re: [DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-11 Thread Ferenc Csaky
Hi Jing,

Thank you for your comments! Updated the FLIP with reasoning on the proposed 
release versions and included them in the headline "Release" field.

Best,
Ferenc




On Sunday, March 10th, 2024 at 16:59, Jing Ge  
wrote:

> 
> 
> Hi Ferenc,
> 
> Thanks for the proposal! +1 for it!
> 
> Similar to what Leonard mentioned. I would suggest:
> 1. Use the "release" to define the release version of the Kudu connector
> itself.
> 2. Optionally, add one more row underneath to describe which Flink versions
> this release will be compatible with, e.g. 1.17, 1.18. I think it makes
> sense to support at least two last Flink releases. An example could be
> found at [1]
> 
> Best regards,
> Jing
> 
> [1] https://lists.apache.org/thread/jcjfy3fgpg5cdnb9noslq2c77h0gtcwp
> 
> On Sun, Mar 10, 2024 at 3:46 PM Yanquan Lv decq12y...@gmail.com wrote:
> 
> > Hi Ferenc, +1 for this FLIP.
> > 
> > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote on Sat, Mar 9, 2024 at 01:49:
> > 
> > > Thank you Jeyhun, Leonard, and Hang for your comments! Let me
> > > address them from earliest to latest.
> > > 
> > > > How do you plan the review process in this case (e.g. incremental
> > > > over existing codebase or cumulative all at once) ?
> > > 
> > > I think incremental would be less time-consuming and complex for
> > > reviewers, so I would lean towards that direction. I would
> > > imagine multiple subtasks for migrating the existing code, and
> > > updating the deprecated interfaces, so those should be separate PRs and
> > > the release can be initiated when everything is merged.
> > > 
> > > > (1) About the release version, could you specify kudu connector version
> > > > instead of flink version 1.18 as external connector version is different
> > > > with flink?
> > > > (2) About the connector config options, could you enumerate these
> > > > options so that we can review they’re reasonable or not?
> > > 
> > > I added these to the FLIP, copied the current config options as is,
> > > PTAL.
> > > 
> > > > (3) Metrics is also key part of connector, could you add the supported
> > > > connector metrics to public interface as well?
> > > 
> > > The current Bahir connector code does not include any metrics and I did
> > > not plan to include them into the scope of this FLIP.
> > > 
> > > > I think that how to state this code originally lived in Bahir may be in
> > > > the
> > > > FLIP.
> > > 
> > > I might miss your point, but the FLIP contains this: "Migrating the
> > > current code, keeping the history and noting explicitly that it was forked
> > > from the Bahir repository [2]." Pls. share if you meant something else.
> > > 
> > > Best,
> > > Ferenc
> > > 
> > > On Friday, March 8th, 2024 at 10:42, Hang Ruan ruanhang1...@gmail.com
> > > wrote:
> > > 
> > > > Hi, Ferenc.
> > > > 
> > > > Thanks for the FLIP discussion. +1 for the proposal.
> > > > I think that how to state this code originally lived in Bahir may be in
> > > > the
> > > > FLIP.
> > > > 
> > > > Best,
> > > > Hang
> > > > 
> > > > Leonard Xu xbjt...@gmail.com wrote on Thu, Mar 7, 2024 at 14:14:
> > > > 
> > > > > Thanks Ferenc for kicking off this discussion, I left some comments
> > > > > here:
> > > > > 
> > > > > (1) About the release version, could you specify kudu connector
> > > > > version
> > > > > instead of flink version 1.18 as external connector version is
> > > > > different
> > > > > with flink ?
> > > > > 
> > > > > (2) About the connector config options, could you enumerate these
> > > > > options
> > > > > so that we can review they’re reasonable or not?
> > > > > 
> > > > > (3) Metrics is also key part of connector, could you add the
> > > > > supported
> > > > > connector metrics to public interface as well?
> > > > > 
> > > > > Best,
> > > > > Leonard
> > > > > 
> > > > > > On Mar 6, 2024, at 11:23 PM, Ferenc Csaky ferenc.cs...@pm.me.INVALID wrote:
> > > > > > 
> > > > > > Hello devs,
> > > > > > 
> > > > > > Opening this thread to discuss a FLIP [1] about externalizing the
> > > > > > Kudu
> > > > > > connector, as recently
> > > > > > the Apache Bahir project were moved to the attic [2]. Some details
> > > > > > were
> > > > > > discussed already
> > > > > > in another thread [3]. I am proposing to externalize this connector
> > > > > > and
> > > > > > keep it maintainable,
> > > > > > and up to date.
> > > > > > 
> > > > > > Best regards,
> > > > > > Ferenc
> > > > > > 
> > > > > > [1]
> > 
> > https://docs.google.com/document/d/1vHF_uVe0FTYCb6PRVStovqDeqb_C_FKjt2P5xXa7uhE
> > 
> > > > > > [2] https://bahir.apache.org/
> > > > > > [3]
> > > > > > https://lists.apache.org/thread/2nb8dxxfznkyl4hlhdm3vkomm8rk4oyq


Re: [DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-08 Thread Ferenc Csaky
Thank you Jeyhun, Leonard, and Hang for your comments! Let me
address them from earliest to latest.

> How do you plan the review process in this case (e.g. incremental
over existing codebase or cumulative all at once) ?

I think incremental would be less time-consuming and complex for
reviewers, so I would lean towards that direction. I would
imagine multiple subtasks for migrating the existing code, and
updating the deprecated interfaces, so those should be separate PRs and the 
release can be initiated when everything is merged.

> (1) About the release version, could you specify kudu connector version 
> instead of flink version 1.18 as external connector version is different with 
> flink?
> (2) About the connector config options, could you enumerate these options so 
> that we can review they’re reasonable or not?

I added these to the FLIP, copied the current config options as is, PTAL.

> (3) Metrics is also key part of connector, could you add the supported 
> connector metrics to public interface as well?

The current Bahir connector code does not include any metrics and I did not 
plan to include them into the scope of this FLIP.

> I think that how to state this code originally lived in Bahir may be in the
FLIP.

I might miss your point, but the FLIP contains this: "Migrating the current 
code, keeping the history and noting explicitly that it was forked from the Bahir 
repository [2]." Pls. share if you meant something else.

Best,
Ferenc



On Friday, March 8th, 2024 at 10:42, Hang Ruan  wrote:

> 
> 
> Hi, Ferenc.
> 
> Thanks for the FLIP discussion. +1 for the proposal.
> I think that how to state this code originally lived in Bahir may be in the
> FLIP.
> 
> Best,
> Hang
> 
> > Leonard Xu xbjt...@gmail.com wrote on Thu, Mar 7, 2024 at 14:14:
> 
> > Thanks Ferenc for kicking off this discussion, I left some comments here:
> > 
> > (1) About the release version, could you specify kudu connector version
> > instead of flink version 1.18 as external connector version is different
> > with flink ?
> > 
> > (2) About the connector config options, could you enumerate these options
> > so that we can review they’re reasonable or not?
> > 
> > (3) Metrics is also key part of connector, could you add the supported
> > connector metrics to public interface as well?
> > 
> > Best,
> > Leonard
> > 
> > > On Mar 6, 2024, at 11:23 PM, Ferenc Csaky ferenc.cs...@pm.me.INVALID wrote:
> > > 
> > > Hello devs,
> > > 
> > > Opening this thread to discuss a FLIP [1] about externalizing the Kudu
> > > connector, as recently
> > > the Apache Bahir project were moved to the attic [2]. Some details were
> > > discussed already
> > > in another thread [3]. I am proposing to externalize this connector and
> > > keep it maintainable,
> > > and up to date.
> > > 
> > > Best regards,
> > > Ferenc
> > > 
> > > [1]
> > > https://docs.google.com/document/d/1vHF_uVe0FTYCb6PRVStovqDeqb_C_FKjt2P5xXa7uhE
> > > [2] https://bahir.apache.org/
> > > [3] https://lists.apache.org/thread/2nb8dxxfznkyl4hlhdm3vkomm8rk4oyq


[DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-06 Thread Ferenc Csaky
Hello devs,

Opening this thread to discuss a FLIP [1] about externalizing the Kudu 
connector, as recently
the Apache Bahir project were moved to the attic [2]. Some details were 
discussed already
in another thread [3]. I am proposing to externalize this connector and keep it 
maintainable,
and up to date.

Best regards,
Ferenc

[1] 
https://docs.google.com/document/d/1vHF_uVe0FTYCb6PRVStovqDeqb_C_FKjt2P5xXa7uhE
[2] https://bahir.apache.org/
[3] https://lists.apache.org/thread/2nb8dxxfznkyl4hlhdm3vkomm8rk4oyq

Re: [DISCUSS] Apache Bahir retired

2024-03-06 Thread Ferenc Csaky
Hi Jing,

Thank you, I will create a new discussion thread.

Best,
Ferenc




On Wednesday, March 6th, 2024 at 10:43, Jing Ge  
wrote:

> 
> 
> Hi Ferenc,
> 
> +1 for it! The proposal looks good, thanks! I would suggest starting a new
> discuss thread following the new FLIP process created by Martijn[1].
> 
> [1] https://issues.apache.org/jira/browse/FLINK-34515
> 
> Best regards,
> Jing
> 
> On Fri, Mar 1, 2024 at 5:25 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi,
> > 
> > According to the current standards I created a Google doc for the FLIP
> > [1]. I pointed it to this discussion, but if a dedicated discussion
> > should be started for it, pls. let me know. PTAL.
> > 
> > Regards,
> > Ferenc
> > 
> > [1]
> > https://docs.google.com/document/d/1vHF_uVe0FTYCb6PRVStovqDeqb_C_FKjt2P5xXa7uhE/edit
> > 
> > On Thursday, February 29th, 2024 at 09:48, Ferenc Csaky
> > ferenc.cs...@pm.me.INVALID wrote:
> > 
> > > Thank you Marton and Martijn, I will proceed with the FLIP then.
> > > 
> > > Best,
> > > Ferenc
> > > 
> > > On Wednesday, February 28th, 2024 at 21:15, Martijn Visser
> > > martijnvis...@apache.org wrote:
> > > 
> > > > Hi all,
> > > > 
> > > > +1 to have a connector FLIP to propose a Kudu connector. I'm +0 overall
> > > > because I don't see a lot of activity happening in newly proposed
> > > > connectors, but if there's demand for it and people want to volunteer
> > > > with
> > > > contributions, there's no reason to block it.
> > > > 
> > > > Best regards,
> > > > 
> > > > Martijn
> > > > 
> > > > On Wed, Feb 28, 2024 at 4:31 PM Márton Balassi
> > > > balassi.mar...@gmail.com
> > > > 
> > > > wrote:
> > > > 
> > > > > Hi team,
> > > > > 
> > > > > Thanks for bringing this up, Feri. I am +1 for maintaining the Kudu
> > > > > connector as an external Flink connector.
> > > > > 
> > > > > As per the legal/trademark questions this is actually fair game
> > > > > because one
> > > > > does not donate code to a specific Apache project, technically it is
> > > > > donated to the Apache Software foundation. Consequently moving
> > > > > between ASF
> > > > > projects is fine, I would add a line to the NOTICE file stating that
> > > > > this
> > > > > code originally lived in Bahir once we forked it.
> > > > > 
> > > > > Although I did not find an easy to link precedent this is also
> > > > > implied in
> > > > > the Attic Bahir site [1] ("notify us if you fork outside Apache")
> > > > > and in
> > > > > this [2] Apache community dev list chat. We should notify the Attic
> > > > > team in
> > > > > any case. :-)
> > > > > 
> > > > > [1] https://attic.apache.org/projects/bahir.html
> > > > > [2] https://lists.apache.org/thread/p31mz4x4dcvd43f026d5p05rpglzfyrt
> > > > > 
> > > > > On Tue, Feb 27, 2024 at 10:09 AM Ferenc Csaky
> > > > > ferenc.cs...@pm.me.invalid
> > > > > wrote:
> > > > > 
> > > > > > Thank you Leonard for sharing your thoughts on this topic.
> > > > > > 
> > > > > > I agree that complying with the Flink community connector
> > > > > > development process would be a must, if there are no legal or
> > > > > > copyright issues, I would be happy to take that task for this
> > > > > > particular case.
> > > > > > 
> > > > > > I am no legal/copyright expert myself, but Bahir uses the Apache
> > > > > > 2.0 license as well, so I believe it should be possible without
> > > > > > too many
> > > > > > complications, but I try to look for help on that front.
> > > > > > 
> > > > > > FYI we are using and supporting a downstream fork of the Kudu
> > > > > > connector
> > > > > > on
> > > > > > top of Flink 1.18 without any major modifications, so it is pretty
> > > > > > up to
> > > > > > date upstream as well.
> > > > > > 
> > > > > > Regards,
> > > > > > Ferenc

[jira] [Created] (FLINK-34580) Job run via REST erases "pipeline.classpaths" config

2024-03-05 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34580:


 Summary: Job run via REST erases "pipeline.classpaths" config
 Key: FLINK-34580
 URL: https://issues.apache.org/jira/browse/FLINK-34580
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Affects Versions: 1.18.1, 1.17.2, 1.19.0
Reporter: Ferenc Csaky
 Fix For: 1.20.0


The 
[{{JarHandlerContext#applyToConfiguration}}|https://github.com/apache/flink/blob/e0b6c121eaf7aeb2974a45d199e452b022f07d29/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/utils/JarHandlerUtils.java#L134]
 creates a {{PackagedProgram}} and then overwrites the {{pipeline.jars}} and 
{{pipeline.classpaths}} values according to that newly created 
{{PackagedProgram}}.

However, that [{{PackagedProgram}} 
init|https://github.com/apache/flink/blob/e0b6c121eaf7aeb2974a45d199e452b022f07d29/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/utils/JarHandlerUtils.java#L185]
 does not set {{classpaths}} at all, so it will always overwrite the effective 
configuration with an empty value, even if it had something previously.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
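A minimal sketch of the overwrite pattern the ticket describes (it uses Flink's
real PipelineOptions.CLASSPATHS option, but the handler flow and values are
condensed for illustration):

    import java.util.Collections;
    import java.util.List;

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.PipelineOptions;

    public class ClasspathOverwriteDemo {
        public static void main(String[] args) {
            Configuration effective = new Configuration();
            // User-provided classpaths, e.g. set at job submission.
            effective.set(PipelineOptions.CLASSPATHS, List.of("http://host/dep.jar"));

            // applyToConfiguration() rewrites the value from the freshly created
            // PackagedProgram, whose classpath list is empty -> previous value lost.
            effective.set(PipelineOptions.CLASSPATHS, Collections.emptyList());

            System.out.println(effective.get(PipelineOptions.CLASSPATHS)); // []
        }
    }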


Re: [DISCUSS] Apache Bahir retired

2024-03-01 Thread Ferenc Csaky
Hi,

According to the current standards I created a Google doc for the FLIP [1]. I 
pointed it to this discussion, but if a dedicated discussion should be 
started for it, pls. let me know. PTAL.

Regards,
Ferenc

[1] 
https://docs.google.com/document/d/1vHF_uVe0FTYCb6PRVStovqDeqb_C_FKjt2P5xXa7uhE/edit




On Thursday, February 29th, 2024 at 09:48, Ferenc Csaky 
 wrote:

> 
> 
> Thank you Marton and Martijn, I will proceed with the FLIP then.
> 
> Best,
> Ferenc
> 
> 
> 
> 
> On Wednesday, February 28th, 2024 at 21:15, Martijn Visser 
> martijnvis...@apache.org wrote:
> 
> > Hi all,
> > 
> > +1 to have a connector FLIP to propose a Kudu connector. I'm +0 overall
> > because I don't see a lot of activity happening in newly proposed
> > connectors, but if there's demand for it and people want to volunteer with
> > contributions, there's no reason to block it.
> > 
> > Best regards,
> > 
> > Martijn
> > 
> > On Wed, Feb 28, 2024 at 4:31 PM Márton Balassi balassi.mar...@gmail.com
> > 
> > wrote:
> > 
> > > Hi team,
> > > 
> > > Thanks for bringing this up, Feri. I am +1 for maintaining the Kudu
> > > connector as an external Flink connector.
> > > 
> > > As per the legal/trademark questions this is actually fair game because 
> > > one
> > > does not donate code to a specific Apache project, technically it is
> > > donated to the Apache Software Foundation. Consequently moving between ASF
> > > projects is fine, I would add a line to the NOTICE file stating that this
> > > code originally lived in Bahir once we forked it.
> > > 
> > > Although I did not find an easy to link precedent this is also implied in
> > > the Attic Bahir site [1] ("notify us if you fork outside Apache") and in
> > > this [2] Apache community dev list chat. We should notify the Attic team 
> > > in
> > > any case. :-)
> > > 
> > > [1] https://attic.apache.org/projects/bahir.html
> > > [2] https://lists.apache.org/thread/p31mz4x4dcvd43f026d5p05rpglzfyrt
> > > 
> > > On Tue, Feb 27, 2024 at 10:09 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > wrote:
> > > 
> > > > Thank you Leonard for sharing your thoughts on this topic.
> > > > 
> > > > I agree that complying with the Flink community connector
> > > > development process would be a must, if there are no legal or
> > > > copyright issues, I would be happy to take that task for this
> > > > particular case.
> > > > 
> > > > I am no legal/copyright expert myself, but Bahir uses the Apache
> > > > 2.0 license as well, so I believe it should be possible without too many
> > > > complications, but I try to look for help on that front.
> > > > 
> > > > FYI we are using and supporting a downstream fork of the Kudu connector
> > > > on
> > > > top of Flink 1.18 without any major modifications, so it is pretty up to
> > > > date upstream as well.
> > > > 
> > > > Regards,
> > > > Ferenc
> > > > 
> > > > On Monday, February 26th, 2024 at 10:29, Leonard Xu xbjt...@gmail.com
> > > > wrote:
> > > > 
> > > > > Hey, Ferenc
> > > > > 
> > > > > Thanks for initiating this discussion. Apache Bahir is a great project
> > > > > that provided significant assistance to many Apache Flink/Spark users.
> > > > > It's
> > > > > pity news that it has been retired.
> > > > > 
> > > > > I believe that connectivity is crucial for building the ecosystem of
> > > > > the
> > > > > Flink such a computing engine. The community, or at least I, would
> > > > > actively
> > > > > support the introduction and maintenance of new connectors. Therefore,
> > > > > adding a Kudu connector or other connectors from Bahir makes sense to 
> > > > > me,
> > > > > as long as we adhere to the development process for connectors in the
> > > > > Flink
> > > > > community[1].
> > > > > I recently visited the Bahir Flink repository. Although the last
> > > > > release
> > > > > of Bahir Flink was in August ’22[2] which is compatible with Flink 
> > > > > 1.14,
> > > > > its latest code is compatible with Flink 1.17[3]. So, based on the
> > > > > existing
> > > > codebase, developing an official Apache Flink connector for Kudu or other
> > > > connectors should be manageable.

Re: [DISCUSS] Apache Bahir retired

2024-02-29 Thread Ferenc Csaky
Thank you Marton and Martijn, I will proceed with the FLIP then.

Best,
Ferenc




On Wednesday, February 28th, 2024 at 21:15, Martijn Visser 
 wrote:

> 
> 
> Hi all,
> 
> +1 to have a connector FLIP to propose a Kudu connector. I'm +0 overall
> because I don't see a lot of activity happening in newly proposed
> connectors, but if there's demand for it and people want to volunteer with
> contributions, there's no reason to block it.
> 
> Best regards,
> 
> Martijn
> 
> On Wed, Feb 28, 2024 at 4:31 PM Márton Balassi balassi.mar...@gmail.com
> 
> wrote:
> 
> > Hi team,
> > 
> > Thanks for bringing this up, Feri. I am +1 for maintaining the Kudu
> > connector as an external Flink connector.
> > 
> > As per the legal/trademark questions this is actually fair game because one
> > does not donate code to a specific Apache project, technically it is
> > donated to the Apache Software Foundation. Consequently moving between ASF
> > projects is fine, I would add a line to the NOTICE file stating that this
> > code originally lived in Bahir once we forked it.
> > 
> > Although I did not find an easy to link precedent this is also implied in
> > the Attic Bahir site [1] ("notify us if you fork outside Apache") and in
> > this [2] Apache community dev list chat. We should notify the Attic team in
> > any case. :-)
> > 
> > [1] https://attic.apache.org/projects/bahir.html
> > [2] https://lists.apache.org/thread/p31mz4x4dcvd43f026d5p05rpglzfyrt
> > 
> > On Tue, Feb 27, 2024 at 10:09 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > wrote:
> > 
> > > Thank you Leonard for sharing your thoughts on this topic.
> > > 
> > > I agree that complying with the Flink community connector
> > > development process would be a must, if there are no legal or
> > > copyright issues, I would be happy to take that task for this
> > > particular case.
> > > 
> > > I am no legal/copyright expert myself, but Bahir uses the Apache
> > > 2.0 license as well, so I believe it should be possible without too many
> > > complications, but I try to look for help on that front.
> > > 
> > > FYI we are using and supporting a downstream fork of the Kudu connector
> > > on
> > > top of Flink 1.18 without any major modifications, so it is pretty up to
> > > date upstream as well.
> > > 
> > > Regards,
> > > Ferenc
> > > 
> > > On Monday, February 26th, 2024 at 10:29, Leonard Xu xbjt...@gmail.com
> > > wrote:
> > > 
> > > > Hey, Ferenc
> > > > 
> > > > Thanks for initiating this discussion. Apache Bahir is a great project
> > > > that provided significant assistance to many Apache Flink/Spark users.
> > > > It's
> > > > pity news that it has been retired.
> > > > 
> > > > I believe that connectivity is crucial for building the ecosystem of
> > > > the
> > > > Flink such a computing engine. The community, or at least I, would
> > > > actively
> > > > support the introduction and maintenance of new connectors. Therefore,
> > > > adding a Kudu connector or other connectors from Bahir makes sense to 
> > > > me,
> > > > as long as we adhere to the development process for connectors in the
> > > > Flink
> > > > community[1].
> > > > I recently visited the Bahir Flink repository. Although the last
> > > > release
> > > > of Bahir Flink was in August ’22[2] which is compatible with Flink 1.14,
> > > > its latest code is compatible with Flink 1.17[3]. So, based on the
> > > > existing
> > > > codebase, developing an official Apache Flink connector for Kudu or 
> > > > other
> > > > connectors should be manageable. One point to consider is that if we're
> > > > not
> > > > developing a connector entirely from scratch but based on an existing
> > > > repository, we must ensure that there are no copyright issues. Here, "no
> > > > issues" means satisfying both Apache Bahir's and Apache Flink's 
> > > > copyright
> > > > requirements. Honestly, I'm not an expert in copyright or legal matters.
> > > > If
> > > > you're interested in contributing to the Kudu connector, it might be
> > > > necessary to attract other experienced community members to participate
> > > > in
> > > > this aspect.
> > > > 
> > > > Best,

Re: [DISCUSS] Apache Bahir retired

2024-02-27 Thread Ferenc Csaky
Thank you Leonard for sharing your thoughts on this topic.

I agree that complying with the Flink community connector
development process would be a must, if there are no legal or
copyright issues, I would be happy to take that task for this
particular case.

I am no legal/copyright expert myself, but Bahir uses the Apache
2.0 license as well, so I believe it should be possible without too many 
complications, but I try to look for help on that front.

FYI we are using and supporting a downstream fork of the Kudu connector on top 
of Flink 1.18 without any major modifications, so it is pretty up to date 
upstream as well.

Regards,
Ferenc




On Monday, February 26th, 2024 at 10:29, Leonard Xu  wrote:

> 
> 
> Hey, Ferenc
> 
> Thanks for initiating this discussion. Apache Bahir is a great project that 
> provided significant assistance to many Apache Flink/Spark users. It's pity 
> news that it has been retired.
> 
> I believe that connectivity is crucial for building the ecosystem of the 
> Flink such a computing engine. The community, or at least I, would actively 
> support the introduction and maintenance of new connectors. Therefore, adding 
> a Kudu connector or other connectors from Bahir makes sense to me, as long as 
> we adhere to the development process for connectors in the Flink community[1].
> I recently visited the Bahir Flink repository. Although the last release of 
> Bahir Flink was in August ’22[2] which is compatible with Flink 1.14, its 
> latest code is compatible with Flink 1.17[3]. So, based on the existing 
> codebase, developing an official Apache Flink connector for Kudu or other 
> connectors should be manageable. One point to consider is that if we're not 
> developing a connector entirely from scratch but based on an existing 
> repository, we must ensure that there are no copyright issues. Here, "no 
> issues" means satisfying both Apache Bahir's and Apache Flink's copyright 
> requirements. Honestly, I'm not an expert in copyright or legal matters. If 
> you're interested in contributing to the Kudu connector, it might be 
> necessary to attract other experienced community members to participate in 
> this aspect.
> 
> Best,
> Leonard
> 
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Connector+Template
> [2] https://github.com/apache/bahir-flink/releases/tag/v1.1.0
> [3] https://github.com/apache/bahir-flink/blob/master/pom.xml#L116
> 
> 
> 
> > On Feb 22, 2024, at 6:37 PM, Ferenc Csaky ferenc.cs...@pm.me.INVALID wrote:
> > 
> > Hello devs,
> > 
> > Just saw that the Bahir project is retired [1]. Any plans on what's 
> > happening with the Flink connectors that were part of this project? We 
> > specifically use the Kudu connector and integrate it into our platform at 
> > Cloudera, so we would be okay to maintain it. Would it be possible to carry 
> > it over as a separate connector repo under the Apache umbrella, similarly to how 
> > it happened with the external connectors previously?
> > 
> > Thanks,
> > Ferenc


Re: Flink's treatment to "hadoop" and "yarn" configuration overrides seems unintuitive

2024-02-26 Thread Ferenc Csaky
Thanks for the additional details, Venkata. I have not used Spark widely myself, 
but if there are more examples
that follow that approach, it makes sense to comply and be
consistent.

Regarding the planned release schedule on the 2.0 wiki page [1], the expected 
releases are 1.19 -> 1.20 -> 2.0. I am not sure how realistic that is, or whether 
there is any chance there will be a 1.21, but even if not, deprecating the 
current behavior even in 1.20 would not hurt IMO.

WDYT?

Looking for other opinions as well, of course.

Regards,
Ferenc

[1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release


On Tuesday, February 27th, 2024 at 07:08, Venkatakrishnan Sowrirajan 
 wrote:

> 
> 
> Thanks for sharing your thoughts, Ferenc.
> 
> Just my 2 cents, coming from Spark background that uses "spark.hadoop."
> prefix to handle all Hadoop configs, I prefer the "flink.hadoop." prefix
> and "flink.yarn." prefix. Generally, users use both these systems and I
> prefer to be consistent that way. Having said that, Flink doesn't need to
> be consistent with Spark so I am fine with the other approach as well.
> 
> I believe in order to make backwards incompatible changes, Flink needs the
> change to be in deprecated status for at least 2 minor versions which means
> we will already have 2.0, therefore this can probably go in 3.0 only.
> 
> It is still good to deprecate the current behavior, fix it with the right
> behavior and get rid of this in 3.0 totally.
> 
> Looking for more thoughts from others in the community to make sure that I
> don't miss anything. Once the discussion settles, I can start a FLIP with
> the new proposal.
> 
> Thanks
> Venkat
> 
> 
> On Mon, Feb 26, 2024, 1:09 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Venkata krishnan,
> > 
> > Thanks for starting a discussion on this topic. I completely
> > agree with you that this behavior can create confusion and
> > cause debugging sessions that could be spared by aligning how Flink
> > parses external properties.
> > 
> > Personally, I find the Yarn props prefixing more intuitive, but
> > I do not have strong opinions other than prefixing configs for
> > external systems should follow the same semantics and behavior.
> > 
> > It would make sense to align these in Flink 2.0 IMO, but I would
> > be curious about other opinions.
> > 
> > On Saturday, February 24th, 2024 at 07:36, Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu> wrote:
> > 
> > > Gentle ping on the ^^ question to surface this back up again. Any
> > > thoughts?
> > > 
> > > Regards
> > > Venkata krishnan
> > > 
> > > On Fri, Feb 16, 2024 at 7:32 PM Venkatakrishnan Sowrirajan
> > > vsowr...@asu.edu
> > > 
> > > wrote:
> > > 
> > > > Hi Flink devs,
> > > > 
> > > > Flink supports overriding "hadoop" and "yarn" configuration. As part of
> > > > the override mechanism, users have to prefix `hadoop` configs with "
> > > > flink.hadoop." and the prefix will be removed, while with `yarn`
> > > > configs
> > > > users have to prefix it with "flink.yarn." but "flink." only is
> > > > removed,
> > > > not "flink.yarn.".
> > > > 
> > > > Following is an example:
> > > > 
> > > > 1. "Hadoop" config
> > > > 
> > > > Hadoop config key = hadoop.tmp.dir => Flink config =
> > > > flink.hadoop.hadoop.tmp.dir => Hadoop's configuration object would have
> > > > hadoop.tmp.dir.
> > > > 
> > > > 2. "YARN" config
> > > > 
> > > > YARN config key = yarn.application.classpath => Flink config =
> > > > flink.yarn.yarn.application.classpath => YARN's configuration object
> > > > would have yarn.yarn.application.classpath.
> > > > 
> > > > Although this is documented
> > 
> > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/config/#flink-yarn-
> > 
> > 
> > > > properly, it feels unintuitive and it tripped me up; it took quite a while to
> > > > understand why the above YARN configuration override was not working as
> > > > expected. Is this something that should be fixed? The problem with
> > > > fixing
> > > > it is, it will become backwards incompatible. Therefore, can this be
> > > > addressed as part of Flink-2.0?
> > > > 
> > > > Any thoughts?
> > > > 
> > > > Regards
> > > > Venkata krishnan
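To make the asymmetry above concrete, a small self-contained sketch of the
described prefix handling ("flink.hadoop." is stripped entirely, while for
"flink.yarn." only "flink." is removed); this is a simplified illustration,
not the actual Flink implementation:

    import java.util.HashMap;
    import java.util.Map;

    public class PrefixDemo {
        static Map<String, String> extract(Map<String, String> flinkConf) {
            Map<String, String> out = new HashMap<>();
            for (Map.Entry<String, String> e : flinkConf.entrySet()) {
                String k = e.getKey();
                if (k.startsWith("flink.hadoop.")) {
                    // Whole prefix removed -> key lands in the Hadoop conf as-is.
                    out.put(k.substring("flink.hadoop.".length()), e.getValue());
                } else if (k.startsWith("flink.yarn.")) {
                    // Only "flink." removed -> the "yarn." part survives.
                    out.put(k.substring("flink.".length()), e.getValue());
                }
            }
            return out;
        }

        public static void main(String[] args) {
            Map<String, String> conf = new HashMap<>();
            conf.put("flink.hadoop.hadoop.tmp.dir", "/tmp");
            conf.put("flink.yarn.yarn.application.classpath", "/cp");
            // -> hadoop.tmp.dir=/tmp and yarn.yarn.application.classpath=/cp
            System.out.println(extract(conf));
        }
    }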


Re: [DISCUSS] Support the Ozone Filesystem

2024-02-26 Thread Ferenc Csaky
Hi,

gentle reminder on this thread, any opinions or thoughts?

Regards,
Ferenc




On Thursday, February 8th, 2024 at 18:02, Ferenc Csaky 
 wrote:

> 
> 
> Hello devs,
> 
> I would like to start a discussion regarding Apache Ozone FS support. The
> jira [1] has been stale for quite a while, but supporting it with some limitations 
> could
> be done with minimal effort.
> 
> Ozone does not have a truncate() impl, so it falls into the same category as
> Hadoop < 2.7 [2]; on the DataStream API it requires the usage of
> OnCheckpointRollingPolicy when checkpointing is enabled, to make sure
> the FileSink will not use truncate().
> 
> The Table API is a bit trickier: the checkpointing rolling policy cannot be
> configured explicitly (why?), and it behaves differently depending on the write
> mode [3]. Bulk mode is covered, but for row format, auto-compaction has to be set.
> 
> Even with the mentioned limitations, I think it would be worth adding support 
> for OFS;
> it would require one small change to enable "ofs" [4] and documenting the 
> limitations.
> 
> WDYT?
> 
> Regards,
> Ferenc
> 
> [1] https://issues.apache.org/jira/browse/FLINK-28231
> [2] 
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/filesystem/#general
> [3] 
> https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/FileSystemTableSink.java#L226
> [4] 
> https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableWriter.java#L62
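For the DataStream API case mentioned above, a minimal sketch of a FileSink that
sidesteps truncate() by rolling on every checkpoint (the ofs:// path is
illustrative, and checkpointing is assumed to be enabled on the job):

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.connector.file.sink.FileSink;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;

    public class OzoneSinkSketch {
        public static FileSink<String> build() {
            return FileSink
                    .forRowFormat(
                            new Path("ofs://ozone-service/volume/bucket/out"),
                            new SimpleStringEncoder<String>("UTF-8"))
                    // Roll on every checkpoint so in-progress files are never
                    // resumed, hence truncate() is never needed on recovery.
                    .withRollingPolicy(OnCheckpointRollingPolicy.build())
                    .build();
        }
    }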


Re: Flink's treatment to "hadoop" and "yarn" configuration overrides seems unintuitive

2024-02-26 Thread Ferenc Csaky
Hi Venkata krishnan,

Thanks for starting a discussion on this topic. I completely
agree with you that this behavior can create confusion and
cause debugging sessions that could be spared by aligning how Flink parses 
external properties.

Personally, I find the Yarn props prefixing more intuitive, but
I do not have strong opinions other than prefixing configs for
external systems should follow the same semantics and behavior.

It would make sense to align these in Flink 2.0 IMO, but I would
be curious about other opinions.




On Saturday, February 24th, 2024 at 07:36, Venkatakrishnan Sowrirajan 
 wrote:

> 
> 
> Gentle ping on the ^^ question to surface this back up again. Any thoughts?
> 
> Regards
> Venkata krishnan
> 
> 
> On Fri, Feb 16, 2024 at 7:32 PM Venkatakrishnan Sowrirajan vsowr...@asu.edu
> 
> wrote:
> 
> > Hi Flink devs,
> > 
> > Flink supports overriding "hadoop" and "yarn" configuration. As part of
> > the override mechanism, users have to prefix `hadoop` configs with "
> > flink.hadoop." and the prefix will be removed, while with `yarn` configs
> > users have to prefix it with "flink.yarn." but "flink." only is removed,
> > not "flink.yarn.".
> > 
> > Following is an example:
> > 
> > 1. "Hadoop" config
> > 
> > Hadoop config key = hadoop.tmp.dir => Flink config =
> > flink.hadoop.hadoop.tmp.dir => Hadoop's configuration object would have
> > hadoop.tmp.dir.
> > 
> > 2. "YARN" config
> > 
> > YARN config key = yarn.application.classpath => Flink config =
> > flink.yarn.yarn.application.classpath => YARN's configuration object
> > would have yarn.yarn.application.classpath.
> > 
> > Although this is documented
> > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/config/#flink-yarn-
> > properly, it feels unintuitive and it tripped me up; it took quite a while to
> > understand why the above YARN configuration override was not working as
> > expected. Is this something that should be fixed? The problem with fixing
> > it is, it will become backwards incompatible. Therefore, can this be
> > addressed as part of Flink-2.0?
> > 
> > Any thoughts?
> > 
> > Regards
> > Venkata krishnan


[jira] [Created] (FLINK-34506) Do not copy "file://" schemed artifact in standalone application modes

2024-02-23 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34506:


 Summary: Do not copy "file://" schemed artifact in standalone 
application modes
 Key: FLINK-34506
 URL: https://issues.apache.org/jira/browse/FLINK-34506
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission
Affects Versions: 1.19.0
Reporter: Ferenc Csaky


In standalone application mode, if an artifact is passed via a path without a 
scheme prefix, the file will be copied to `user.artifacts.base-dir`, although it 
should not be, as it is accessible locally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
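A sketch of the scheme check the fix implies (the helper and class names are
hypothetical, not the actual client code):

    import java.net.URI;

    public class ArtifactFetchSketch {
        /** Only artifacts with a remote scheme should be fetched/copied. */
        static boolean needsCopy(String artifact) {
            String scheme = URI.create(artifact).getScheme();
            // No scheme or "file" means the artifact is already accessible locally.
            return scheme != null && !"file".equals(scheme);
        }

        public static void main(String[] args) {
            System.out.println(needsCopy("file:///opt/flink/usrlib/job.jar")); // false
            System.out.println(needsCopy("/opt/flink/usrlib/job.jar"));        // false
            System.out.println(needsCopy("https://example.host/job.jar"));     // true
        }
    }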


[DISCUSS] Apache Bahir retired

2024-02-22 Thread Ferenc Csaky
Hello devs,

Just saw that the Bahir project is retired [1]. Any plans on what's happening 
with the Flink connectors that were part of this project? We specifically use 
the Kudu connector and integrate it into our platform at Cloudera, so we would be 
okay to maintain it. Would it be possible to carry it over as a separate 
connector repo under the Apache umbrella, similarly to how it happened with the 
external connectors previously?

Thanks,
Ferenc

[DISCUSS] Support the Ozone Filesystem

2024-02-08 Thread Ferenc Csaky
Hello devs,

I would like to start a discussion regarding Apache Ozone FS support. The
jira [1] has been stale for quite a while, but supporting it with some limitations
could be done with minimal effort.

Ozone does not have a truncate() impl, so it falls into the same category as
Hadoop < 2.7 [2]; on the DataStream API it requires the usage of
OnCheckpointRollingPolicy when checkpointing is enabled, to make sure
the FileSink will not use truncate().

The Table API is a bit trickier: the checkpointing rolling policy cannot be
configured explicitly (why?), and it behaves differently depending on the write
mode [3]. Bulk mode is covered, but for row format, auto-compaction has to be set.

Even with the mentioned limitations, I think it would be worth adding support for 
OFS;
it would require one small change to enable "ofs" [4] and documenting the 
limitations.

WDYT?

Regards,
Ferenc

[1] https://issues.apache.org/jira/browse/FLINK-28231
[2] 
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/filesystem/#general
[3] 
https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/FileSystemTableSink.java#L226
[4] 
https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableWriter.java#L62

Re: [VOTE] Release flink-connector-hbase v3.0.1, release candidate #2

2024-02-06 Thread Ferenc Csaky
+1 (non-binding)

- Validated checksum
- Verified signature
- Verified no binaries in src archive
- Built with Maven 3.8.6 + JDK11
- Verified web PR

BR,
Ferenc


On Saturday, January 13th, 2024 at 04:59, Hang Ruan  
wrote:

> 
> 
> +1 (non-binding)
> 
> - Validated checksum hash
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven and jdk11
> - Verified web PR
> 
> Best,
> Hang
> 
> Martijn Visser martijnvis...@apache.org wrote on Fri, Jan 12, 2024 at 20:30:
> 
> > Hi everyone,
> > Please review and vote on the release candidate #2 for the
> > flink-connector-hbase version
> > 3.0.1, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> > 
> > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x
> > 
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to dist.apache.org
> > [2], which are signed with the key with fingerprint
> > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v3.0.1-rc2 [5],
> > * website pull request listing the new release [6].
> > 
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> > 
> > Thanks,
> > Release Manager
> > 
> > [1] https://issues.apache.org/jira/projects/FLINK/versions/12353603
> > [2]
> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-hbase-3.0.1-rc2
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1696/
> > [5]
> > https://github.com/apache/flink-connector-hbase/releases/tag/v3.0.1-rc2
> > [6] https://github.com/apache/flink-web/pull/708


[jira] [Created] (FLINK-34388) Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone and native K8s application mode

2024-02-06 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-34388:


 Summary: Release Testing: Verify FLINK-28915 Support artifact 
fetching in Standalone and native K8s application mode
 Key: FLINK-34388
 URL: https://issues.apache.org/jira/browse/FLINK-34388
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Affects Versions: 1.19.0
Reporter: Ferenc Csaky
 Fix For: 1.19.0


This ticket covers testing three related features: FLINK-33695, FLINK-33735 and 
FLINK-33696.

Instructions:
# Configure Flink to use 
[Slf4jTraceReporter|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/trace_reporters/#slf4j]
 and enable *INFO* level logging (to console or to a file, it doesn't matter); 
see the config sketch after the reference below.
# Start a streaming job with checkpointing enabled.
# Let it run for a couple of checkpoints.
# Verify the presence of a single *JobInitialization* [1] trace logged just 
after job startup.
# Verify the presence of a couple of *Checkpoint* [1] traces logged after each 
successful or failed checkpoint.

[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/traces/#checkpointing-and-initialization
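
A config sketch for steps 1-2; note that the trace reporter keys below are 
assumed by analogy with the metric reporter options, so please take the exact 
names from the linked trace reporter docs:

{code}
# assumed key and class names -- verify against the trace_reporters docs
traces.reporter.slf4j.factory.class: org.apache.flink.traces.slf4j.Slf4jTraceReporterFactory
# periodic checkpoints for steps 2-3
execution.checkpointing.interval: 30s
{code}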



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Drop support for HBase v1

2024-01-30 Thread Ferenc Csaky
Hi Martijn,

thanks for starting the discussion. Let me link the older discussion regarding 
the same topic [1]. My opinion did not change, so +1.

BR,
Ferenc

[1] https://lists.apache.org/thread/x7l2gj8g93r4v6x6953cyt6jrs8c4r1b




On Monday, January 29th, 2024 at 09:37, Martijn Visser 
 wrote:

> 
> 
> Hi all,
> 
> While working on adding support for Flink 1.19 for HBase, we've run into a
> dependency convergence issue because HBase v1 relies on a really old
> version of Guava.
> 
> HBase v2 has been made available since May 2018, and there have been no new
> releases of HBase v1 since August 2022.
> 
> I would like to propose that the Flink HBase connector drops support for
> HBase v1, and will only continue HBase v2 in the future. I don't think this
> requires a full FLIP and vote, but I do want to start a discussion thread
> for this.
> 
> Best regards,
> 
> Martijn


Re: [VOTE] FLIP-393: Make QueryOperations SQL serializable

2023-11-21 Thread Ferenc Csaky
+1 (non-binding)

Looking forward to this!

Best,
Ferenc




On Tuesday, November 21st, 2023 at 12:21, Martijn Visser 
 wrote:


> 
> 
> +1 (binding)
> 
> Thanks for driving this.
> 
> Best regards,
> 
> Martijn
> 
> On Tue, Nov 21, 2023 at 12:18 PM Benchao Li libenc...@apache.org wrote:
> 
> > +1 (binding)
> > 
> > On Tue, Nov 21, 2023 at 18:56, Dawid Wysakowicz wysakowicz.da...@gmail.com wrote:
> > 
> > > Hi everyone,
> > > 
> > > Thank you to everyone for the feedback on FLIP-393: Make QueryOperations
> > > SQL serializable[1]
> > > which has been discussed in this thread [2].
> > > 
> > > I would like to start a vote for it. The vote will be open for at least 72
> > > hours unless there is an objection or not enough votes.
> > > 
> > > [1] https://cwiki.apache.org/confluence/x/vQ2ZE
> > > [2] https://lists.apache.org/thread/ztyk68brsbmwwo66o1nvk3f6fqqhdxgk
> > 
> > --
> > 
> > Best,
> > Benchao Li


Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-11-20 Thread Ferenc Csaky
Hello devs,

Is any active work happening on this FLIP? As far as I see, there
are blockers regarding artifact distribution that need to be
resolved first to implement it.

Is this work halted completely, or are efforts going into
resolving the blockers first?

Our platform would benefit from this feature a lot; we have a
kind-of working custom implementation at the moment, but it is
uniquely adapted to our app and platform.

I could help out to move this forward.

Best,
Ferenc



On Friday, June 30th, 2023 at 04:53, Paul Lam  wrote:


> 
> 
> Hi Jing,
> 
> Thanks for your input!
> 
> > Would you like to add
> > one section to describe(better with script/code example) how to use it in
> > these two scenarios from users' perspective?
> 
> 
> OK. I’ll update the FLIP with the code snippet after I get the POC branch 
> done.
> 
> > NIT: the pictures have transparent background when readers click on it. It
> > would be great if you can replace them with pictures with white background.
> 
> 
> Fixed. Thanks for pointing that out :)
> 
> Best,
> Paul Lam
> 
> > On Jun 27, 2023 at 06:51, Jing Ge j...@ververica.com.INVALID wrote:
> > 
> > Hi Paul,
> > 
> > Thanks for driving it and thank you all for the informative discussion! The
> > FLIP is in good shape now. As described in the FLIP, SQL Driver will be
> > mainly used to run Flink SQLs in two scenarios: 1. SQL client/gateway in
> > application mode and 2. external system integration. Would you like to add
> > one section to describe(better with script/code example) how to use it in
> > these two scenarios from users' perspective?
> > 
> > NIT: the pictures have transparent background when readers click on it. It
> > would be great if you can replace them with pictures with white background.
> > 
> > Best regards,
> > Jing
> > 
> > On Mon, Jun 26, 2023 at 1:31 PM Paul Lam paullin3...@gmail.com wrote:
> > 
> > > Hi Shengkai,
> > > 
> > > > * How can we ship the json plan to the JobManager?
> > > 
> > > The Flink K8s module should be responsible for file distribution. We could
> > > introduce
> > > an option like `kubernetes.storage.dir`. For each flink cluster, there
> > > would be a
> > > dedicated subdirectory, with the pattern like
> > > `${kubernetes.storage.dir}/${cluster-id}`.
> > > 
> > > All resources-related options (e.g. pipeline jars, json plans) that are
> > > configured with
> > > scheme `file://` would be 
> > > uploaded to the resource directory
> > > and downloaded to the
> > > jobmanager, before SQL Driver accesses the files with the original
> > > filenames.
> > > 
> > > > * Classloading strategy
> > > 
> > > We could directly specify the SQL Gateway jar as the jar file in
> > > PackagedProgram.
> > > It would be treated like a normal user jar and the SQL Driver is loaded
> > > into the user
> > > classloader. WDYT?
> > > 
> > > > * Option `$internal.sql-gateway.driver.sql-config` is string type
> > > > I think it's better to use Map type here
> > > 
> > > By Map type configuration, do you mean a nested map that contains all
> > > configurations?
> > > 
> > > I hope I've explained myself well, it’s a file that contains the extra SQL
> > > configurations, which would be shipped to the jobmanager.
> > > 
> > > > * PoC branch
> > > 
> > > Sure. I’ll let you know once I get the job done.
> > > 
> > > Best,
> > > Paul Lam
> > > 
> > > > On Jun 26, 2023 at 14:27, Shengkai Fang fskm...@gmail.com wrote:
> > > > 
> > > > Hi, Paul.
> > > > 
> > > > Thanks for your update. I have a few questions about the new design:
> > > > 
> > > > * How can we ship the json plan to the JobManager?
> > > > 
> > > > The current design only exposes an option about the URL of the json
> > > > plan. It seems the gateway is responsible to upload to an external 
> > > > stroage.
> > > > Can we reuse the PipelineOptions.JARS to ship to the remote filesystem?
> > > > 
> > > > * Classloading strategy
> > > > 
> > > > Currently, the Driver is in the sql-gateway package. It means the Driver
> > > > is not in the JM's classpath directly. Because the sql-gateway jar is 
> > > > now
> > > > in the opt directory rather than lib directory. It may need to add the
> > > > external dependencies as Python does[1]. BTW, I think it's better to 
> > > > move
> > > > the Driver into the flink-table-runtime package, which is much easier to
> > > > find(Sorry for the wrong opinion before).
> > > > 
> > > > * Option `$internal.sql-gateway.driver.sql-config` is string type
> > > > 
> > > > I think it's better to use Map type here
> > > > 
> > > > * PoC branch
> > > > 
> > > > Because this FLIP involves many modules, do you have a PoC branch to
> > > > verify it does work?
> > > > 
> > > > Best,
> > > > Shengkai
> > > > 
> > > > [1]
> > > > https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L940
> > > >  
> > > > 

[jira] [Created] (FLINK-33542) Update HBase connector tests to JUnit5

2023-11-14 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-33542:


 Summary: Update HBase connector tests to JUnit5
 Key: FLINK-33542
 URL: https://issues.apache.org/jira/browse/FLINK-33542
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / HBase
Reporter: Ferenc Csaky
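

An illustrative sketch of the kind of mechanical change involved (the test 
class and method names are hypothetical):

{code:java}
// before (JUnit4)
import org.junit.Test;

public class HBaseConnectorITCase {
    @Test
    public void testTableSink() throws Exception { /* ... */ }
}

// after (JUnit5)
import org.junit.jupiter.api.Test;

class HBaseConnectorITCase {
    @Test
    void testTableSink() throws Exception { /* ... */ }
}
{code}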






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [RESULT][VOTE] Release flink-connector-hbase v3.0.0, release candidate #2

2023-11-02 Thread Ferenc Csaky
Hi Martijn!

Is this work in progress?

Thanks,
Ferenc




--- Original Message ---
On Tuesday, September 12th, 2023 at 10:47, Martijn Visser 
 wrote:


> 
> 
> I'm happy to announce that we have unanimously approved this release.
> 
> There are 7 approving votes, 3 of which are binding:
> * Ahmed (non-binding)
> * Sergey (non-binding)
> * Samrat (non-binding)
> * Ferenc (non-binding)
> * Danny (binding)
> * Leonard (binding)
> * Dong (binding)
> 
> There are no disapproving votes.
> 
> I'll work on completing the release. Thanks all!
> 
> Best regards,
> 
> Martijn


[jira] [Created] (FLINK-33440) Bump flink version on flink-connectors-hbase

2023-11-02 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-33440:


 Summary: Bump flink version on flink-connectors-hbase
 Key: FLINK-33440
 URL: https://issues.apache.org/jira/browse/FLINK-33440
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / HBase
Reporter: Ferenc Csaky


Follow up on the 1.18 release in the connector repo as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33353) SQL fails because "TimestampType.kind" is not serialized

2023-10-24 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-33353:


 Summary: SQL fails because "TimestampType.kind" is not serialized 
 Key: FLINK-33353
 URL: https://issues.apache.org/jira/browse/FLINK-33353
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.18.0
Reporter: Ferenc Csaky


We have a custom persistent catalog store, which stores tables, views, etc. in 
a DB. In our application, it is required to utilize the serialized formats of 
entities; the same applies to Hive, as it functions as a persistent 
catalog.

Take the following example SQL:

{code:sql}
CREATE TABLE IF NOT EXISTS `txn_gen` (
  `txn_id` INT,
  `amount` INT,
  `ts` TIMESTAMP(3),
   WATERMARK FOR `ts` AS `ts` - INTERVAL '1' SECOND
) WITH (
  'connector' = 'datagen',
  'fields.txn_id.min' = '1',
  'fields.txn_id.max' = '5',
  'rows-per-second' = '1'
);

CREATE VIEW IF NOT EXISTS aggr_ten_sec AS
  SELECT txn_id,
 TUMBLE_ROWTIME(`ts`, INTERVAL '10' SECOND) AS w_row_time,
 COUNT(txn_id) AS txn_count
FROM txn_gen
GROUP BY txn_id, TUMBLE(`ts`, INTERVAL '10' SECOND);

SELECT txn_id,
   SUM(txn_count),
   TUMBLE_START(w_row_time, INTERVAL '20' SECOND) AS total_txn_count
  FROM aggr_ten_sec
  GROUP BY txn_id, TUMBLE(w_row_time, INTERVAL '20' SECOND);
{code}

This will work without any problems when we simply execute it in a 
{{TableEnvironment}}, but it fails with the below error when we try to execute 
the query based on the serialized table metadata.
{code}
org.apache.flink.table.api.TableException: Window aggregate can only be defined 
over a time attribute column, but TIMESTAMP(3) encountered.
{code}

If there is a view that requires the use of ROWTIME, it will be lost, and we 
cannot recreate the same query from the serialized entities.

Currently, the "kind" field in {{TimestampType}} is deliberately annotated as 
{{@Internal}} and is not serialized, although this breaks the described 
functionality.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Release 1.18.0, release candidate #1

2023-10-03 Thread Ferenc Csaky
Thanks everyone for the efforts!

Checked the following:

- Downloaded artifacts
- Built Flink from source
- Verified checksums/signatures
- Verified NOTICE, LICENSE files
- Deployed dummy SELECT job via SQL gateway on standalone cluster, things 
seemed fine according to the log files

+1 (non-binding)

Best,
Ferenc


--- Original Message ---
On Friday, September 29th, 2023 at 22:12, Gabor Somogyi 
 wrote:


> 
> 
> Thanks for the efforts!
> 
> +1 (non-binding)
> 
> * Verified versions in the poms
> * Built from source
> * Verified checksums and signatures
> * Started basic workloads with kubernetes operator
> * Verified NOTICE and LICENSE files
> 
> G
> 
> On Fri, Sep 29, 2023, 18:16 Matthias Pohl matthias.p...@aiven.io.invalid
> 
> wrote:
> 
> > Thanks for creating RC1. I did the following checks:
> > 
> > * Downloaded artifacts
> > * Built Flink from sources
> > * Verified SHA512 checksums and GPG signatures
> > * Compared checkout with provided sources
> > * Verified pom file versions
> > * Went over NOTICE file/pom files changes without finding anything
> > suspicious
> > * Deployed standalone session cluster and ran WordCount example in batch
> > and streaming: Nothing suspicious in log files found
> > 
> > +1 (binding)
> > 
> > On Fri, Sep 29, 2023 at 10:34 AM Etienne Chauchot echauc...@apache.org
> > wrote:
> > 
> > > Hi all,
> > > 
> > > Thanks to the team for this RC.
> > > 
> > > I did a quick check of this RC against user pipelines (1) coded with
> > > DataSet (even if deprecated and soon removed), DataStream and SQL APIs
> > > 
> > > based on the small scope of this test, LGTM
> > > 
> > > +1 (non-binding)
> > > 
> > > [1] https://github.com/echauchot/tpcds-benchmark-flink
> > > 
> > > Best
> > > Etienne
> > > 
> > > Le 28/09/2023 à 19:35, Jing Ge a écrit :
> > > 
> > > > Hi everyone,
> > > > 
> > > > The RC1 for Apache Flink 1.18.0 has been created. The related voting
> > > > process will be triggered once the announcement is ready. The RC1 has
> > > > all
> > > > the artifacts that we would typically have for a release, except for
> > > > the
> > > > release note and the website pull request for the release announcement.
> > > > 
> > > > The following contents are available for your review:
> > > > 
> > > > - Confirmation of no benchmarks regression at the thread[1].
> > > > - The preview source release and binary convenience releases [2], which
> > > > are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 [3].
> > > > - all artifacts that would normally be deployed to the Maven
> > > > Central Repository [4].
> > > > - source code tag "release-1.18.0-rc1" [5]
> > > > 
> > > > Your help testing the release will be greatly appreciated! And we'll
> > > > create the rc1 release and the voting thread as soon as all the efforts
> > > > are
> > > > finished.
> > > > 
> > > > [1]https://lists.apache.org/thread/yxyphglwwvq57wcqlfrnk3qo9t3sr2ro
> > > > [2]https://dist.apache.org/repos/dist/dev/flink/flink-1.18.0-rc1/
> > > > [3]https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > > https://repository.apache.org/content/repositories/orgapacheflink-1657
> > > > [5]https://github.com/apache/flink/releases/tag/release-1.18.0-rc1
> > > > 
> > > > Best regards,
> > > > Qingsheng, Sergei, Konstantin and Jing


Re: Future of classical remoting in Pekko

2023-09-20 Thread Ferenc Csaky
That is a fair point. It does not fix a bug per se, but it would mitigate 
security vulnerabilities (Netty 3.x CVEs), so my thought was that it might 
qualify for addressing in a patch release.

IMO handling security vulnerabilities is a gray area; if it only requires 
bumping some deps that are only used internally, it can be considered a 
patch-compatible change.

I am not sure at this point what changes would be required to use the updated 
Pekko version, so it is possible that we can only introduce it in 1.19; I just 
wanted to clarify my thought process.


--- Original Message ---
On Wednesday, September 20th, 2023 at 14:26, Martijn Visser 
 wrote:


> 
> 
> Just chipping in that I don't think we should add Pekko changes in a
> patch release, because I think the Pekko related changes don't fix a
> bug.
> 
> On Tue, Sep 19, 2023 at 9:06 PM Ferenc Csaky ferenc.cs...@pm.me.invalid wrote:
> 
> > I think that is totally fine, because any Pekko related changes can only be 
> > added to the first patch release of 1.18 at this point, as there is an RC0 
> > [1] already so the release process will be initiated soon.
> > 
> > I am glad the mentioned PR got merged, did not have the chance to review.
> > 
> > [1] https://lists.apache.org/thread/5x28rp3zct4p603hm4zdwx6kfr101w38
> > 
> > --- Original Message ---
> > On Monday, September 18th, 2023 at 14:20, Matthew de Detrich 
> > matthew.dedetr...@aiven.io.INVALID wrote:
> > 
> > > I think that the end of September is too soon for a Pekko 1.1.x, there are
> > > still more things
> > > that we would like to merge before making a release.
> > > 
> > > Good news is that the PR to migrate to netty4 for classic remoting has 
> > > been
> > > merged
> > > (see https://github.com/apache/incubator-pekko/pull/643). Improvements are
> > > also
> > > still be done, so the next minor version release of Pekko (1.1.0) will
> > > contain these
> > > changes.
> > > 
> > > On Wed, Sep 13, 2023 at 11:22 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > 
> > > wrote:
> > > 
> > > > The target release date for 1.18 is the end of Sept [1], but I'm not 
> > > > sure
> > > > everything will come together by then. Maybe it will pushed by a couple
> > > > days.
> > > > 
> > > > I'm happy to help out, even making the Flink related changes when we're 
> > > > at
> > > > that point.
> > > > 
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/1.18+Release
> > > > 
> > > > --- Original Message ---
> > > > On Tuesday, September 12th, 2023 at 17:43, He Pin he...@apache.org
> > > > wrote:
> > > > 
> > > > > Hi Ferenc:
> > > > > What's the ETA of the Flink 1.18? I think we should beable to
> > > > > collaborate on this,and at work we are using Flink too.
> > > > > 
> > > > > On 2023/09/12 15:16:11 Ferenc Csaky wrote:
> > > > > 
> > > > > > Hi Matthew,
> > > > > > 
> > > > > > Thanks for bringing this up! Cca half a year ago I started to work 
> > > > > > on
> > > > > > an Akka Artery migration, there is a draft PR for that 1. It might 
> > > > > > be an
> > > > > > option to revive that work and point it against Pekko instead. 
> > > > > > Although I
> > > > > > would highlight FLINK-29281 2 which will replace the whole RPC
> > > > > > implementation in Flink to a gRPC-based one when it is done.
> > > > > > 
> > > > > > I am not sure about the progess on the gRPC work, it looks hanging 
> > > > > > for
> > > > > > a while now, so I think if there is a chance to replace Netty3 with 
> > > > > > Netty4
> > > > > > in Pekko in the short term it would benefit Flink and then we can 
> > > > > > decide if
> > > > > > it would worth to upgrade to Artery, or how fast the gRPC solution 
> > > > > > can be
> > > > > > done and then it will not be necessary.
> > > > > > 
> > > > > > All in all, in the short term I think Flink would benefit to have 
> > > > > > that
> > > > > > mentioned PR 3 merged, then the updated Pekko version could be 
> > > > > > included in
> > > > > > the first 1.18 patch proba

Re: Future of classical remoting in Pekko

2023-09-19 Thread Ferenc Csaky
I think that is totally fine, because any Pekko-related changes can only be 
added to the first patch release of 1.18 at this point; there is an RC0 [1] 
already, so the release process will be initiated soon.

I am glad the mentioned PR got merged; I did not have the chance to review it.

[1] https://lists.apache.org/thread/5x28rp3zct4p603hm4zdwx6kfr101w38



--- Original Message ---
On Monday, September 18th, 2023 at 14:20, Matthew de Detrich 
 wrote:


> 
> 
> I think that the end of September is too soon for a Pekko 1.1.x, there are
> still more things
> that we would like to merge before making a release.
> 
> Good news is that the PR to migrate to netty4 for classic remoting has been
> merged
> (see https://github.com/apache/incubator-pekko/pull/643). Improvements are
> also
> still be done, so the next minor version release of Pekko (1.1.0) will
> contain these
> changes.
> 
> On Wed, Sep 13, 2023 at 11:22 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > The target release date for 1.18 is the end of Sept [1], but I'm not sure
> > everything will come together by then. Maybe it will pushed by a couple
> > days.
> > 
> > I'm happy to help out, even making the Flink related changes when we're at
> > that point.
> > 
> > [1] https://cwiki.apache.org/confluence/display/FLINK/1.18+Release
> > 
> > --- Original Message ---
> > On Tuesday, September 12th, 2023 at 17:43, He Pin he...@apache.org
> > wrote:
> > 
> > > Hi Ferenc:
> > > What's the ETA of the Flink 1.18? I think we should beable to
> > > collaborate on this,and at work we are using Flink too.
> > > 
> > > On 2023/09/12 15:16:11 Ferenc Csaky wrote:
> > > 
> > > > Hi Matthew,
> > > > 
> > > > Thanks for bringing this up! Cca half a year ago I started to work on
> > > > an Akka Artery migration, there is a draft PR for that 1. It might be an
> > > > option to revive that work and point it against Pekko instead. Although 
> > > > I
> > > > would highlight FLINK-29281 2 which will replace the whole RPC
> > > > implementation in Flink to a gRPC-based one when it is done.
> > > > 
> > > > I am not sure about the progess on the gRPC work, it looks hanging for
> > > > a while now, so I think if there is a chance to replace Netty3 with 
> > > > Netty4
> > > > in Pekko in the short term it would benefit Flink and then we can 
> > > > decide if
> > > > it would worth to upgrade to Artery, or how fast the gRPC solution can 
> > > > be
> > > > done and then it will not be necessary.
> > > > 
> > > > All in all, in the short term I think Flink would benefit to have that
> > > > mentioned PR 3 merged, then the updated Pekko version could be included 
> > > > in
> > > > the first 1.18 patch probably to mitigate those pesky Netty3 CVEs that 
> > > > are
> > > > carried for a while ASAP.
> > > > 
> > > > Cheers,
> > > > Ferenc
> > > > 
> > > > 1 https://github.com/apache/flink/pull/22271
> > > > 2 https://issues.apache.org/jira/browse/FLINK-29281
> > > > 3 https://github.com/apache/incubator-pekko/pull/643
> > > > 
> > > > --- Original Message ---
> > > > On Tuesday, September 12th, 2023 at 10:29, Matthew de Detrich
> > > > matthew.dedetr...@aiven.io.INVALID wrote:
> > > > 
> > > > > It's come to my attention that Flink is using Pekko's classical
> > > > > remoting,
> > > > > if this is the case then I would recommend making a response at
> > > > > https://lists.apache.org/thread/19h2wrs2om91g5vhnftv583fo0ddfshm .
> > > > > 
> > > > > Quick summary of what is being discussed is what to do with Pekko's
> > > > > classical remoting. Classic remoting is considered deprecated since
> > > > > 2019,
> > > > > an artifact that we inherited from Akka1. Ontop of this classical
> > > > > remoting happens to be using netty3 which has known CVE's2, these
> > > > > CVE's
> > > > > were never fixed in the netty3 series.
> > > > > 
> > > > > The question is what should be done given this, i.e. some people in
> > > > > the
> > > > > Pekko community are wanting to drop classical remoting as quickly as
> > > > > possible (i.e. even sooner then what semver allows but this is being
>

Re: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and support the Standalone Autoscaler

2023-09-13 Thread Ferenc Csaky
Looking forward to this!

+1 (non-binding)

Cheers,
Ferenc


--- Original Message ---
On Wednesday, September 13th, 2023 at 12:33, Maximilian Michels 
 wrote:


> 
> 
> +1 (binding)
> 
> On Wed, Sep 13, 2023 at 12:28 PM Gyula Fóra gyula.f...@gmail.com wrote:
> 
> > +1 (binding)
> > 
> > Gyula
> > 
> > On Wed, 13 Sep 2023 at 09:33, Matt Wang wang...@163.com wrote:
> > 
> > > Thank you for driving this FLIP,
> > > 
> > > +1 (non-binding)
> > > 
> > > --
> > > 
> > > Best,
> > > Matt Wang
> > > 
> > >  Replied Message 
> > > | From | conradjamjam.gz...@gmail.com |
> > > | Date | 09/13/2023 15:28 |
> > > | To | dev@flink.apache.org |
> > > | Subject | Re: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and
> > > support the Standalone Autoscaler |
> > > best idea
> > > +1 (non-binding)
> > > 
> > > On Wed, Sep 13, 2023 at 15:23, Ahmed Hamdy hamdy10...@gmail.com wrote:
> > > 
> > > Hi Rui,
> > > I have gone through the thread.
> > > +1 (non-binding)
> > > 
> > > Best Regards
> > > Ahmed Hamdy
> > > 
> > > On Wed, 13 Sept 2023 at 03:53, Rui Fan 1996fan...@gmail.com wrote:
> > > 
> > > Hi all,
> > > 
> > > Thanks for all the feedback about the FLIP-334:
> > > Decoupling autoscaler and kubernetes and
> > > support the Standalone Autoscaler[1].
> > > This FLIP was discussed in [2].
> > > 
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours (until Sep 16th 11:00 UTC+8) unless there is an objection or
> > > insufficient votes.
> > > 
> > > [1]
> > > 
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-334+%3A+Decoupling+autoscaler+and+kubernetes+and+support+the+Standalone+Autoscaler
> > > [2] https://lists.apache.org/thread/kmm03gls1vw4x6vk1ypr9ny9q9522495
> > > 
> > > Best,
> > > Rui
> > > 
> > > --
> > > Best
> > > 
> > > ConradJam


Re: Future of classical remoting in Pekko

2023-09-13 Thread Ferenc Csaky
The target release date for 1.18 is the end of Sept [1], but I'm not sure 
everything will come together by then. Maybe it will be pushed by a couple of 
days.

I'm happy to help out, even with making the Flink-related changes when we're at 
that point.

[1] https://cwiki.apache.org/confluence/display/FLINK/1.18+Release

--- Original Message ---
On Tuesday, September 12th, 2023 at 17:43, He Pin  wrote:


> 
> 
> Hi Ferenc:
> What's the ETA of the Flink 1.18? I think we should be able to collaborate on 
> this, and at work we are using Flink too.
> 
> On 2023/09/12 15:16:11 Ferenc Csaky wrote:
> 
> > Hi Matthew,
> > 
> > Thanks for bringing this up! Cca half a year ago I started to work on an 
> > Akka Artery migration, there is a draft PR for that 1. It might be an 
> > option to revive that work and point it against Pekko instead. Although I 
> > would highlight FLINK-29281 2 which will replace the whole RPC 
> > implementation in Flink to a gRPC-based one when it is done.
> > 
> > I am not sure about the progess on the gRPC work, it looks hanging for a 
> > while now, so I think if there is a chance to replace Netty3 with Netty4 in 
> > Pekko in the short term it would benefit Flink and then we can decide if it 
> > would worth to upgrade to Artery, or how fast the gRPC solution can be done 
> > and then it will not be necessary.
> > 
> > All in all, in the short term I think Flink would benefit to have that 
> > mentioned PR 3 merged, then the updated Pekko version could be included in 
> > the first 1.18 patch probably to mitigate those pesky Netty3 CVEs that are 
> > carried for a while ASAP.
> > 
> > Cheers,
> > Ferenc
> > 
> > 1 https://github.com/apache/flink/pull/22271
> > 2 https://issues.apache.org/jira/browse/FLINK-29281
> > 3 https://github.com/apache/incubator-pekko/pull/643
> > 
> > --- Original Message ---
> > On Tuesday, September 12th, 2023 at 10:29, Matthew de Detrich 
> > matthew.dedetr...@aiven.io.INVALID wrote:
> > 
> > > It's come to my attention that Flink is using Pekko's classical remoting,
> > > if this is the case then I would recommend making a response at
> > > https://lists.apache.org/thread/19h2wrs2om91g5vhnftv583fo0ddfshm .
> > > 
> > > Quick summary of what is being discussed is what to do with Pekko's
> > > classical remoting. Classic remoting is considered deprecated since 2019,
> > > an artifact that we inherited from Akka1. Ontop of this classical
> > > remoting happens to be using netty3 which has known CVE's2, these CVE's
> > > were never fixed in the netty3 series.
> > > 
> > > The question is what should be done given this, i.e. some people in the
> > > Pekko community are wanting to drop classical remoting as quickly as
> > > possible (i.e. even sooner then what semver allows but this is being
> > > discussed) and others are wanting to leave it as it is (even with the
> > > CVE's) since we don't want to incentivize and/or create impression that we
> > > are officially supporting it. There is also a currently open PR3 which
> > > upgrades Pekko's classical remoting's from netty3 to netty4 with the
> > > primary motivator being removing said CVE's.
> > > 
> > > My personal position on the matter is that Pekko shouldn't drop classical
> > > remoting until 2.0.x (to satisfy semver) while also updating Pekko's
> > > classical remoting netty dependency to netty4 so that we are not shipping
> > > Pekko with known CVE's (if this gets approved such a change would likely
> > > land in Pekko 1.1.0). As is customary, such a decision should be agreed
> > > upon broadly in the Pekko community.
> > > 
> > > Note that regardless of this change, it's recommended that a plan should 
> > > be
> > > made at some point by Flink to move from classical remoting to artery4
> > > although the decision that Pekko ultimately makes may influence the
> > > timeline (hence the reason for this thread).
> > > 
> > > --
> > > 
> > > Matthew de Detrich
> > > 
> > > Aiven Deutschland GmbH
> > > 
> > > Immanuelkirchstraße 26, 10405 Berlin
> > > 
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > > 
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > 
> > > m: +491603708037
> > > 
> > > w: aiven.io e: matthew.dedetr...@aiven.io


Re: Future of classical remoting in Pekko

2023-09-12 Thread Ferenc Csaky
Hi Matthew,

Thanks for bringing this up! Circa half a year ago I started to work on an Akka 
Artery migration, there is a draft PR for that [1]. It might be an option to 
revive that work and point it against Pekko instead. Although I would highlight 
FLINK-29281 [2], which will replace the whole RPC implementation in Flink with 
a gRPC-based one when it is done.

I am not sure about the progress on the gRPC work, it looks like it has been 
hanging for a while now, so I think if there is a chance to replace Netty3 with 
Netty4 in Pekko in the short term, it would benefit Flink, and then we can 
decide whether it would be worth upgrading to Artery, or how fast the gRPC 
solution can be done, in which case it will not be necessary.

All in all, in the short term I think Flink would benefit from having the 
mentioned PR [3] merged; the updated Pekko version could then probably be 
included in the first 1.18 patch, to mitigate ASAP those pesky Netty3 CVEs that 
have been carried around for a while.

Cheers,
Ferenc

[1] https://github.com/apache/flink/pull/22271
[2] https://issues.apache.org/jira/browse/FLINK-29281
[3] https://github.com/apache/incubator-pekko/pull/643


--- Original Message ---
On Tuesday, September 12th, 2023 at 10:29, Matthew de Detrich 
 wrote:


> 
> 
> It's come to my attention that Flink is using Pekko's classical remoting,
> if this is the case then I would recommend making a response at
> https://lists.apache.org/thread/19h2wrs2om91g5vhnftv583fo0ddfshm .
> 
> Quick summary of what is being discussed is what to do with Pekko's
> classical remoting. Classic remoting is considered deprecated since 2019,
> an artifact that we inherited from Akka[1]. Ontop of this classical
> remoting happens to be using netty3 which has known CVE's[2], these CVE's
> were never fixed in the netty3 series.
> 
> The question is what should be done given this, i.e. some people in the
> Pekko community are wanting to drop classical remoting as quickly as
> possible (i.e. even sooner then what semver allows but this is being
> discussed) and others are wanting to leave it as it is (even with the
> CVE's) since we don't want to incentivize and/or create impression that we
> are officially supporting it. There is also a currently open PR[3] which
> upgrades Pekko's classical remoting's from netty3 to netty4 with the
> primary motivator being removing said CVE's.
> 
> My personal position on the matter is that Pekko shouldn't drop classical
> remoting until 2.0.x (to satisfy semver) while also updating Pekko's
> classical remoting netty dependency to netty4 so that we are not shipping
> Pekko with known CVE's (if this gets approved such a change would likely
> land in Pekko 1.1.0). As is customary, such a decision should be agreed
> upon broadly in the Pekko community.
> 
> Note that regardless of this change, it's recommended that a plan should be
> made at some point by Flink to move from classical remoting to artery[4]
> although the decision that Pekko ultimately makes may influence the
> timeline (hence the reason for this thread).
> 
> [1]: https://github.com/akka/akka/issues/31764
> [2]: https://mvnrepository.com/artifact/io.netty/netty/3.10.6.Final
> [3]: https://github.com/apache/incubator-pekko/pull/643
> [4]: https://pekko.apache.org/docs/pekko/current/remoting-artery.html
> 
> --
> 
> Matthew de Detrich
> 
> Aiven Deutschland GmbH
> 
> Immanuelkirchstraße 26, 10405 Berlin
> 
> Amtsgericht Charlottenburg, HRB 209739 B
> 
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> 
> m: +491603708037
> 
> w: aiven.io e: matthew.dedetr...@aiven.io


Re: [VOTE] Release flink-connector-hbase v3.0.0, release candidate 2

2023-09-05 Thread Ferenc Csaky
Hi,

Thanks Martijn for initiating the release!

+1 (non-binding)

- checked signatures and checksums
- checked source has no binaries
- checked LICENSE and NOTICE files
- approved web PR

Cheers,
Ferenc




--- Original Message ---
On Monday, September 4th, 2023 at 12:54, Samrat Deb  
wrote:


> 
> 
> Hi,
> 
> +1 (non-binding)
> 
> Verified NOTICE files
> Verified CheckSum and signatures
> Glanced through PR[1] , Looks good to me
> 
> Bests,
> Samrat
> 
> [1]https://github.com/apache/flink-web/pull/591
> 
> 
> > On 04-Sep-2023, at 2:22 PM, Ahmed Hamdy hamdy10...@gmail.com wrote:
> > 
> > Hi Martijn,
> > +1 (non-binding)
> > 
> > - verified Checksums and signatures
> > - no binaries in source
> > - Checked NOTICE files contains migrated artifacts
> > - tag is correct
> > - Approved Web PR
> > 
> > Best Regards
> > Ahmed Hamdy
> > 
> > On Fri, 1 Sept 2023 at 15:35, Martijn Visser martijnvis...@apache.org
> > wrote:
> > 
> > > Hi everyone,
> > > 
> > > Please review and vote on the release candidate #2 for the version 3.0.0,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > > 
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org
> > > [2],
> > > which are signed with the key with fingerprint
> > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v3.0.0-rc2 [5],
> > > * website pull request listing the new release [6].
> > > 
> > > This replaces the old, cancelled vote of RC1 [7]. This version is the
> > > externalized version which is compatible with Flink 1.16 and 1.17.
> > > 
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > > 
> > > Thanks,
> > > Release Manager
> > > 
> > > [1]
> > > 
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352578
> > > [2]
> > > 
> > > https://dist.apache.org/repos/dist/dev/flink/flink-connector-hbase-3.0.0-rc2
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4] https://repository.apache.org/content/repositories/orgapacheflink-1650
> > > [5]
> > > https://github.com/apache/flink-connector-hbase/releases/tag/v3.0.0-rc2
> > > [6] https://github.com/apache/flink-web/pull/591
> > > [7] https://lists.apache.org/thread/wbl6sc86q9s5mmz5slx4z09svh91cpr0


[jira] [Created] (FLINK-32811) Add port range support for taskmanager.data.bind-port

2023-08-08 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-32811:


 Summary: Add port range support for taskmanager.data.bind-port
 Key: FLINK-32811
 URL: https://issues.apache.org/jira/browse/FLINK-32811
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration, Runtime / Coordination
Reporter: Ferenc Csaky
 Fix For: 1.19.0


Adding this feature could be helpful for installations in restrictive network 
setups. The "port range" support is already available for some other port 
config options anyway.

Right now, it is possible to specify a {{taskmanager.data.port}} and 
{{taskmanager.data.bind-port}} to be able to support NAT-like setups, although 
{{taskmanager.data.port}} is not bound to anything itself, so supporting a port 
range there is not an option, according to my understanding.

That said, supporting a port range only for {{taskmanager.data.bind-port}} can 
still be helpful for anyone who does not require NAT capability, because if 
{{taskmanager.data.bind-port}} is set and {{taskmanager.data.port}} is set to 
*0*, then the bound port will be used everywhere.

This change should keep the already possible setups working as is.
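
A config sketch to illustrate (the concrete ports are made up; the range syntax 
mirrors other Flink port options that already accept ranges):

{code}
# works today: fixed bind port + advertised port for NAT-like setups
taskmanager.data.bind-port: 46000
taskmanager.data.port: 46000

# proposed: a range for the bind port; with taskmanager.data.port left
# at 0, the actually bound port would be used everywhere
taskmanager.data.bind-port: 46000-46100
taskmanager.data.port: 0
{code}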



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Flink 1.18 feature freeze

2023-08-07 Thread Ferenc Csaky
Hi,

I would like to ask whether releasing the external HBase connector and removing 
those modules from the core Flink repo before Flink 1.18 sounds feasible at 
this point or not.

The connector source has been ready for quite a while, I externalized the E2E 
tests a couple of weeks ago, and CI is passing, so in theory there is no 
remaining task to release the connector.

Martijn manages the tasks regarding connector externalization, but he is on 
holiday at the moment, so if this is something that can be done, is there 
anyone who can help with the connector release? I volunteer to help.

Thanks,
Ferenc

--- Original Message ---
On Wednesday, August 2nd, 2023 at 03:27, Jing Ge  
wrote:


> 
> 
> Hi,
> 
> Thanks for keeping us in the loop. +1 for merging these PRs.
> 
> @Alex Please feel free to contact us for CR and PR merges.
> 
> @Matthias the Pekko migration is an important task, since it is related to
> CVE issues. Thanks for bringing it to our attention.
> 
> Best regards,
> Jing
> 
> 
> 
> On Tue, Aug 1, 2023 at 5:23 PM Qingsheng Ren re...@apache.org wrote:
> 
> > Thanks for letting us know, Matthias!
> > 
> > As discussed in the release sync, +1 for merging these PRs.
> > 
> > Best,
> > Qingsheng
> > 
> > On Tue, Aug 1, 2023 at 5:17 PM Matthias Pohl matthias.p...@aiven.io
> > wrote:
> > 
> > > I'm requesting to merge in FLINK-32098 [1]. It's a minor change that
> > > reduces the amount of exists calls to S3 while submitting a job (which
> > > can
> > > be an expensive operation if the object actually doesn't exist but the
> > > corresponding bucket itself contains a lot of objects). The PR is
> > > reviewed
> > > and ready to be merged. The change itself is minor and covered by
> > > existing
> > > tests.
> > > 
> > > Additionally, I want to mention the two already merged (after
> > > feature-freeze) changes:
> > > - FLINK-32583 [2] which is a minor bugfix. I didn't explicitly mention it
> > > initially because of the fact that it fixes a bug (which admittedly was
> > > already present in older versions of Flink). I'm happy to revert that one
> > > if the release managers have concerns. It's fixing a scenario where the
> > > RestClient becomes unresponsive when submitting a request in rare cases.
> > > - Migration from Akka to Pekko (FLINK-32468 [3], FLINK-32683 [4]): This
> > > was agreed on in last week's 1.18 release sync. I just forgot to make it
> > > public on the ML. The Pekko change will be "release-tested" as part of
> > > FLINK-32678 stress test.
> > > 
> > > Matthias
> > > 
> > > [1] https://issues.apache.org/jira/browse/FLINK-32098
> > > [2] https://issues.apache.org/jira/browse/FLINK-32583
> > > [3] https://issues.apache.org/jira/browse/FLINK-32468
> > > [4] https://issues.apache.org/jira/browse/FLINK-32683
> > > [5] https://issues.apache.org/jira/browse/FLINK-32678
> > > 
> > > On Tue, Aug 1, 2023 at 10:58 AM Qingsheng Ren re...@apache.org wrote:
> > > 
> > > > Thanks for letting us know, Alexander and Leonard!
> > > > 
> > > > I checked these PRs and the changes are trivial. +1 for merging them.
> > > > 
> > > > Best,
> > > > Qingsheng
> > > > 
> > > > On Tue, Aug 1, 2023 at 12:14 AM Leonard Xu xbjt...@gmail.com wrote:
> > > > 
> > > > > Thank all Release Managers for driving the 1.18 release work!
> > > > > 
> > > > > > - Deprecation works for 2.0
> > > > > > 
> > > > > > As discussed in another thread [3], we will not give extra
> > > > > > extensions
> > > > > > to
> > > > > > deprecation works considering the overhead and potential side
> > > > > > effects
> > > > > > to
> > > > > > the timeline of 1.18. We can accept tiny changes that only add
> > > > > > annotations
> > > > > > and JavaDocs, but please let us know before you are going to do
> > > > > > that.
> > > > > 
> > > > > Alexander and I ready to deprecate SourceFunction APIs as above
> > > > > discussion
> > > > > thread[1][2], now we apply the permission to merge following three
> > > > > PRs :
> > > > > 
> > > > > The first two PRs [3][4] only contains @Deprecated annotations and
> > > > > JavaDocs, the PR[5] contains @Deprecated annotations, JavaDocs, and
> > > > > necessary tiny changes for example code as some examples with strict
> > > > > deprecation compiler checks to SourceFunction API, it should be okay
> > > > > as
> > > > > it
> > > > > only changed example code, you can check the tiny change in this
> > > > > commit[5].
> > > > > 
> > > > > Best,
> > > > > Alexander and Leonard
> > > > > 
> > > > > [1]https://lists.apache.org/thread/yyw52k45x2sp1jszldtdx7hc98n72w7k
> > > > > [2]https://lists.apache.org/thread/kv9rj3w2rmkb8jtss5bqffhw57or7v8v
> > > > > [3]https://github.com/apache/flink/pull/23106
> > > > > [4]https://github.com/apache/flink/pull/23079
> > > > > [5]https://github.com/apache/flink/pull/23105
> > > > > [6]
> > 
> > https://github.com/apache/flink/pull/23079/commits/f5ea3c073d36f21fb4fe47e83c717ac080995509


[jira] [Created] (FLINK-32660) Support external file systems in FileCatalogStore

2023-07-24 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-32660:


 Summary: Support external file systems in FileCatalogStore
 Key: FLINK-32660
 URL: https://issues.apache.org/jira/browse/FLINK-32660
 Project: Flink
  Issue Type: Sub-task
Reporter: Ferenc Csaky
 Fix For: 1.18.0
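

A sketch of the intended usage with the file catalog store options (option 
names as I understand them from the FLIP-295 work; the s3 path is made up):

{code}
table.catalog-store.kind: file
# currently effectively limited to local paths; this task is about
# allowing Flink-supported external file systems here, e.g.:
table.catalog-store.file.path: s3://my-bucket/catalog-store/
{code}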






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Persistent SQL Gateway

2023-06-29 Thread Ferenc Csaky
Hi Shammon,

Thank you for your answer and explanation; my latest experiment was a SELECT 
query and my assumptions were based on that. INSERT works as described.

Regarding the state of FLIP-295, I just checked out the recently created jiras 
[1]; if I can help out with any part, please let me know.

Cheers,
F

[1] https://issues.apache.org/jira/browse/FLINK-32427


--- Original Message ---
On Tuesday, June 27th, 2023 at 13:39, Shammon FY  wrote:


> 
> 
> Hi Ferenc,
> 
> If I understand correctly, there will be two types of jobs in sql-gateway:
> `SELECT` and `NON-SELECT` such as `INSERT`.
> 
> 1. `SELECT` jobs need to collect results from Flink cluster in a
> corresponding session of sql gateway, and when the session is closed, the
> job should be canceled. These jobs are generally short queries similar to
> OLAP and I think it may be acceptable.
> 
> 2. `NON-SELECT` jobs may be batch or streaming jobs, and when the jobs are
> submitted successfully, they won't be killed or canceled even if the
> session or sql-gateway is closed. After these assignments are successfully
> submitted, the lifecycle is no longer managed by SQL gateway.
> 
> I don't know if it covers your usage scenario. Could you describe yours for
> us to test and confirm?
> 
> Best,
> Shammon FY
> 
> 
> On Tue, Jun 27, 2023 at 6:43 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Jark,
> > 
> > In the current implementation, any job submitted via the SQL Gateway has
> > to be done through a session, cause all the operations are grouped under
> > sessions.
> > 
> > Starting from there, if I close a session, that will close the
> > "SessionContext", which closes the "OperationManager" [1], and the
> > "OperationManager" closes all submitted operations tied to that session
> > [2], which results closing all the jobs executed in the session.
> > 
> > Maybe I am missing something, but my experience is that the jobs I submit
> > via the SQL Gateway are getting cleaned up on gateway session close.
> > 
> > WDYT?
> > 
> > Cheers,
> > F
> > 
> > [1]
> > https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/SessionContext.java#L204
> > [2]
> > https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationManager.java#L194
> > 
> > --- Original Message ---
> > On Tuesday, June 27th, 2023 at 04:37, Jark Wu imj...@gmail.com wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > But the job lifecycle doesn't tie to the SQL Gateway session.
> > > Even if the session is closed, all the running jobs are not affected.
> > > 
> > > Best,
> > > Jark
> > > 
> > > On Tue, 27 Jun 2023 at 04:14, Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > 
> > > wrote:
> > > 
> > > > Hi Jark,
> > > > 
> > > > Thank you for pointing out FLIP-295 abouth catalog persistence, I was
> > > > not
> > > > aware the current state. Although as far as I see, that persistent
> > > > catalogs
> > > > are necessary, but not sufficient achieving a "persistent gateway".
> > > > 
> > > > The current implementation ties the job lifecycle to the SQL gateway
> > > > session, so if it gets closed, it will cancel all the jobs. So that
> > > > would
> > > > be the next step I think. Any work or thought regarding this aspect?
> > > > We are
> > > > definitely willing to help out on this front.
> > > > 
> > > > Cheers,
> > > > F
> > > > 
> > > > --- Original Message ---
> > > > On Sunday, June 25th, 2023 at 06:23, Jark Wu imj...@gmail.com wrote:
> > > > 
> > > > > Hi Ferenc,
> > > > > 
> > > > > Making SQL Gateway to be an easy-to-use platform infrastructure of
> > > > > Flink
> > > > > SQL
> > > > > is one of the important roadmaps 1.
> > > > > 
> > > > > The persistence ability of the SQL Gateway is a major work in 1.18
> > > > > release.
> > > > > One of the persistence demand is that the registered catalogs are
> > > > > currently
> > > > > kept in memory and lost when Gateway restarts. There is an accepted
> >

Re: [DISCUSS] Persistent SQL Gateway

2023-06-27 Thread Ferenc Csaky
Hi Jark,

In the current implementation, any job submitted via the SQL Gateway has to be 
done through a session, because all operations are grouped under sessions.

Starting from there, if I close a session, that will close the 
"SessionContext", which closes the "OperationManager" [1], and the 
"OperationManager" closes all submitted operations tied to that session [2], 
which results in closing all the jobs executed in the session.
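
To illustrate the lifecycle through the REST endpoints (handles abbreviated; 
the port and paths are from my test setup against the v1 API):

    # open a session
    curl -X POST http://localhost:8083/v1/sessions
    # -> {"sessionHandle": "..."}

    # run a statement inside that session
    curl -X POST http://localhost:8083/v1/sessions/<handle>/statements \
         -d '{"statement": "SELECT ..."}'

    # closing the session also cancels the jobs submitted through it
    curl -X DELETE http://localhost:8083/v1/sessions/<handle>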

Maybe I am missing something, but my experience is that the jobs I submit via 
the SQL Gateway are getting cleaned up on gateway session close.

WDYT?

Cheers,
F

[1] 
https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/SessionContext.java#L204
[2] 
https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationManager.java#L194



--- Original Message ---
On Tuesday, June 27th, 2023 at 04:37, Jark Wu  wrote:


> 
> 
> Hi Ferenc,
> 
> But the job lifecycle doesn't tie to the SQL Gateway session.
> Even if the session is closed, all the running jobs are not affected.
> 
> Best,
> Jark
> 
> 
> 
> 
> On Tue, 27 Jun 2023 at 04:14, Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Jark,
> > 
> > Thank you for pointing out FLIP-295 abouth catalog persistence, I was not
> > aware the current state. Although as far as I see, that persistent catalogs
> > are necessary, but not sufficient achieving a "persistent gateway".
> > 
> > The current implementation ties the job lifecycle to the SQL gateway
> > session, so if it gets closed, it will cancel all the jobs. So that would
> > be the next step I think. Any work or thought regarding this aspect? We are
> > definitely willing to help out on this front.
> > 
> > Cheers,
> > F
> > 
> > --- Original Message ---
> > On Sunday, June 25th, 2023 at 06:23, Jark Wu imj...@gmail.com wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > Making SQL Gateway to be an easy-to-use platform infrastructure of Flink
> > > SQL
> > > is one of the important roadmaps 1.
> > > 
> > > The persistence ability of the SQL Gateway is a major work in 1.18
> > > release.
> > > One of the persistence demand is that the registered catalogs are
> > > currently
> > > kept in memory and lost when Gateway restarts. There is an accepted FLIP
> > > (FLIP-295)[2] target to resolve this issue and make Gateway can persist
> > > the
> > > registered catalogs information into files or databases.
> > > 
> > > I'm not sure whether this is something you are looking for?
> > > 
> > > Best,
> > > Jark
> > > 
> > > [2]:
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > 
> > > On Fri, 23 Jun 2023 at 00:25, Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > 
> > > wrote:
> > > 
> > > > Hello devs,
> > > > 
> > > > I would like to open a discussion about persistence possibilitis for
> > > > the
> > > > SQL Gateway. At Cloudera, we are happy to see the work already done on
> > > > this
> > > > project and looking for ways to utilize it on our platform as well, but
> > > > currently it lacks some features that would be essential in our case,
> > > > where
> > > > we could help out.
> > > > 
> > > > I am not sure if any thought went into gateway persistence specifics
> > > > already, and this feature could be implemented in fundamentally
> > > > differnt
> > > > ways, so I think the frist step could be to agree on the basics.
> > > > 
> > > > First, in my opinion, persistence should be an optional feature of the
> > > > gateway, that can be enabled if desired. There can be a lot of
> > > > implementation details, but there can be some major directions to
> > > > follow:
> > > > 
> > > > - Utilize Hive catalog: The Hive catalog can already be used to have
> > > > persistenct meta-objects, so the crucial thing that would be missing in
> > > > this case is other catalogs. Personally, I would not pursue this
> > > > option,
> > > > because in my opinion it would limit the usability of this feature to

Re: [DISCUSS] Persistent SQL Gateway

2023-06-26 Thread Ferenc Csaky
Hi Jark,

Thank you for pointing out FLIP-295 about catalog persistence; I was not aware 
of the current state. Although as far as I see, persistent catalogs are 
necessary, but not sufficient for achieving a "persistent gateway".

The current implementation ties the job lifecycle to the SQL gateway session, 
so if it gets closed, it will cancel all the jobs. So that would be the next 
step, I think. Any work or thoughts regarding this aspect? We are definitely 
willing to help out on this front.

Cheers,
F


--- Original Message ---
On Sunday, June 25th, 2023 at 06:23, Jark Wu  wrote:


>
>
> Hi Ferenc,
>
> Making SQL Gateway to be an easy-to-use platform infrastructure of Flink
> SQL
> is one of the important roadmaps [1].
>
> The persistence ability of the SQL Gateway is a major work in 1.18 release.
> One of the persistence demand is that the registered catalogs are currently
> kept in memory and lost when Gateway restarts. There is an accepted FLIP
> (FLIP-295)[2] target to resolve this issue and make Gateway can persist the
> registered catalogs information into files or databases.
>
> I'm not sure whether this is something you are looking for?
>
> Best,
> Jark
>
>
> [1]: https://flink.apache.org/roadmap/#a-unified-sql-platform
> [2]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>
> On Fri, 23 Jun 2023 at 00:25, Ferenc Csaky ferenc.cs...@pm.me.invalid
>
> wrote:
>
> > Hello devs,
> >
> > I would like to open a discussion about persistence possibilitis for the
> > SQL Gateway. At Cloudera, we are happy to see the work already done on this
> > project and looking for ways to utilize it on our platform as well, but
> > currently it lacks some features that would be essential in our case, where
> > we could help out.
> >
> > I am not sure if any thought went into gateway persistence specifics
> > already, and this feature could be implemented in fundamentally differnt
> > ways, so I think the frist step could be to agree on the basics.
> >
> > First, in my opinion, persistence should be an optional feature of the
> > gateway, that can be enabled if desired. There can be a lot of
> > implementation details, but there can be some major directions to follow:
> >
> > - Utilize Hive catalog: The Hive catalog can already be used to have
> > persistenct meta-objects, so the crucial thing that would be missing in
> > this case is other catalogs. Personally, I would not pursue this option,
> > because in my opinion it would limit the usability of this feature too much.
> > - Serialize the session as is: Saving the whole session (or its context)
> > [1] as is to durable storage, so it can be kept and picked up again.
> > - Serialize the required elements (catalogs, tables, functions, etc.), not
> > necessarily as a whole: The main point here would be to serialize a
> > different object, so the persistent data will not be that sensitive to
> > changes of the session (or its context). There can be numerous factors
> > here, like try to keep the model close to the session itself, so the
> > boilerplate required for the mapping can be kept to minimal, or focus on
> > saving what is actually necessary, making the persistent storage more
> > portable.
> >
> > WDYT?
> >
> > Cheers,
> > F
> >
> > [1]
> > https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java


[DISCUSS] Persistent SQL Gateway

2023-06-22 Thread Ferenc Csaky
Hello devs,

I would like to open a discussion about persistence possibilitis for the SQL 
Gateway. At Cloudera, we are happy to see the work already done on this project 
and looking for ways to utilize it on our platform as well, but currently it 
lacks some features that would be essential in our case, where we could help 
out.

I am not sure if any thought went into gateway persistence specifics already, 
and this feature could be implemented in fundamentally differnt ways, so I 
think the frist step could be to agree on the basics.

First, in my opinion, persistence should be an optional feature of the gateway, 
that can be enabled if desired. There can be a lot of implementation details, 
but there can be some major directions to follow:

- Utilize Hive catalog: The Hive catalog can already be used to have 
persistenct meta-objects, so the crucial thing that would be missing in this 
case is other catalogs. Personally, I would not pursue this option, because in 
my opinion it would limit the usability of this feature too much.
- Serialize the session as is: Saving the whole session (or its context) [1] as 
is to durable storage, so it can be kept and picked up again.
- Serialize the required elements (catalogs, tables, functions, etc.), not 
necessarily as a whole: The main point here would be to serialize a different 
object, so the persistent data will not be that sensitive to changes of the 
session (or its context). There can be numerous factors here, such as trying to 
keep the model close to the session itself, so the boilerplate required for the 
mapping can be kept to a minimum, or focusing on saving what is actually necessary, 
making the persistent storage more portable.
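
To make the third option a bit more concrete, here is a rough sketch of a 
dedicated, versioned snapshot model (every name below is hypothetical and only 
illustrates the idea, none of this is an existing Flink API):

import java.io.IOException;
import java.io.Serializable;
import java.util.Map;

/** Hypothetical, versioned snapshot of the session state worth keeping. */
public class SessionSnapshot implements Serializable {

    private static final long serialVersionUID = 1L;

    // Explicit version, so the persisted data is decoupled from changes
    // of the Session/SessionContext classes themselves.
    private final int snapshotVersion;
    // Catalog name -> properties needed to re-register the catalog.
    private final Map<String, Map<String, String>> catalogs;
    // Config options that were SET in the session.
    private final Map<String, String> sessionConfig;

    public SessionSnapshot(
            int snapshotVersion,
            Map<String, Map<String, String>> catalogs,
            Map<String, String> sessionConfig) {
        this.snapshotVersion = snapshotVersion;
        this.catalogs = catalogs;
        this.sessionConfig = sessionConfig;
    }
}

/** Hypothetical pluggable storage, so file system, DB, etc. can be swapped. */
interface SessionStateStore {
    void save(String sessionId, SessionSnapshot snapshot) throws IOException;
    SessionSnapshot load(String sessionId) throws IOException;
}

With something like this, the gateway would only need a small mapping layer 
between the live session and the snapshot, and the storage backend would stay 
an implementation detail behind the store interface.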

WDYT?

Cheers,
F

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java

[jira] [Created] (FLINK-32174) Update Cloudera product and link in doc page

2023-05-24 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-32174:


 Summary: Update Cloudera product and link in doc page
 Key: FLINK-32174
 URL: https://issues.apache.org/jira/browse/FLINK-32174
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Ferenc Csaky






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31085) Add schema option to confluent registry avro formats

2023-02-15 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-31085:


 Summary: Add schema option to confluent registry avro formats
 Key: FLINK-31085
 URL: https://issues.apache.org/jira/browse/FLINK-31085
 Project: Flink
  Issue Type: Improvement
Reporter: Ferenc Csaky
 Fix For: 1.17.0


When using {{avro-confluent}} and {{debezium-avro-confluent}} formats with 
schemas already defined in the Confluent Schema Registry, serialization fails, 
because Flink uses the default name {{record}} when converting row types to an 
Avro schema. So if the predefined schema has a different name, the serialization 
schema will be incompatible with the registered schema due to name mismatch. 
Check [this|https://lists.apache.org/thread/5xppmnqjqwfzxqo4gvd3lzz8wzs566zp] 
thread about reproducing the issue.
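
To illustrate the proposal, the idea is to let the user pass the pre-registered 
schema to the format instead of having it derived from the row type. A sketch of 
how this could look from the Table API (the {{avro-confluent.schema}} option 
below is the proposed addition, not a settled API, and all endpoints are made up):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ConfluentSchemaOptionExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // With an explicit schema, the serializer can keep the record name
        // ("Order") that is already registered in the Schema Registry,
        // instead of the derived default name that causes the mismatch.
        tEnv.executeSql(
                "CREATE TABLE orders (\n"
                        + "  user_id BIGINT,\n"
                        + "  product STRING\n"
                        + ") WITH (\n"
                        + "  'connector' = 'kafka',\n"
                        + "  'topic' = 'orders',\n"
                        + "  'properties.bootstrap.servers' = 'kafka:9092',\n"
                        + "  'format' = 'avro-confluent',\n"
                        + "  'avro-confluent.url' = 'http://registry:8081',\n"
                        + "  'avro-confluent.schema' = '{\"type\":\"record\",\"name\":\"Order\","
                        + "\"fields\":[{\"name\":\"user_id\",\"type\":\"long\"},"
                        + "{\"name\":\"product\",\"type\":\"string\"}]}'\n"
                        + ")");
    }
}
{code}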



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-hbase v3.0.0, release candidate #1

2023-01-24 Thread Ferenc Csaky
Hi! 

According to the last comment on the e2e test PR [1], it would be okay to 
release the connector in its current form and deliver a new/rewritten test 
later. Every other issue should be resolved.

[1] https://github.com/apache/flink-connector-hbase/pull/5


--- Original Message ---
On Friday, December 9th, 2022 at 08:13, Martijn Visser 
 wrote:


> 
> 
> Thanks all for the check. This RC is cancelled and I'll create a new one
> when the fixes are done.
> 
> On Thu, Dec 8, 2022 at 1:22 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Heck, thank you Chesnay for catching this. I spent some time on removing
> > the 1.4 hbase-client from flink-connector-hbase-2.2, because excluding it
> > from flink-connector-hbase-base did not seem to work. I missed removing that
> > exclusion from the 2.2 sql connector pom. Opened a PR [1] with the fix.
> > 
> > [1] https://github.com/apache/flink-connector-hbase/pull/4
> > 
> > --- Original Message ---
> > On Thursday, December 8th, 2022 at 11:15, Chesnay Schepler <
> > ches...@apache.org> wrote:
> > 
> > > -1
> > > 
> > > The packaging of the sql-connector-hbase-2.2 module has
> > > changed significantly, no longer bundling hbase-client (it now only
> > > bundles Flink classes).
> > > 
> > > On 05/12/2022 13:24, Ferenc Csaky wrote:
> > > 
> > > > Hi Martijn,
> > > > 
> > > > +1 (non-binding)
> > > > 
> > > > - Verified hashes/signatures
> > > > - Maven repo content LGTM
> > > > - No binaries in the source archive
> > > > - Built source/tests pass
> > > > - Tag exists in GH
> > > > - Reviewed web PR
> > > > 
> > > > Thanks,
> > > > F
> > > > 
> > > > --- Original Message ---
> > > > On Friday, December 2nd, 2022 at 14:04, Martijn Visser
> > > > martijnvis...@apache.org wrote:
> > > > 
> > > > > Hi everyone,
> > > > > Please review and vote on the release candidate #1 for the
> > > > > flink-connector-hbase version v3.0.0, as follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > > > 
> > > > > Note: This is the first externalized version of the HBase connector.
> > > > > 
> > > > > The complete staging area is available for your review, which
> > > > > includes:
> > > > > * JIRA release notes [1],
> > > > > * the official Apache source release to be deployed to
> > > > > dist.apache.org [2],
> > > > > which are signed with the key with fingerprint
> > > > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > > * source code tag v3.0.0-rc1 [5],
> > > > > * website pull request listing the new release [6].
> > > > > 
> > > > > The vote will be open for at least 72 hours. It is adopted by
> > > > > majority
> > > > > approval, with at least 3 PMC affirmative votes.
> > > > > 
> > > > > Thanks,
> > > > > Martijn
> > > > > 
> > > > > https://twitter.com/MartijnVisser82
> > > > > https://github.com/MartijnVisser
> > > > > 
> > > > > [1]
> > 
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352578
> > 
> > > > > [2]
> > 
> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-hbase-3.0.0-rc1/
> > 
> > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [4]
> > > > > https://repository.apache.org/content/repositories/orgapacheflink-1555/
> > > > > [5]
> > > > > https://github.com/apache/flink-connector-hbase/releases/tag/v3.0.0-rc1
> > > > > [6] https://github.com/apache/flink-web/pull/591


Re: [VOTE] Release flink-connector-hbase v3.0.0, release candidate #1

2022-12-08 Thread Ferenc Csaky
Heck, thank you Chesnay for catching this. I spent some time on removing the 
1.4 hbase-client from flink-connector-hbase-2.2, because excluding it from 
flink-connector-hbase-base did not seem to work. I missed removing that exclusion 
from the 2.2 sql connector pom. Opened a PR [1] with the fix.

[1] https://github.com/apache/flink-connector-hbase/pull/4
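
For context, the leftover in the 2.2 sql connector pom was roughly of this 
shape (a simplified sketch, not the verbatim pom); dropping the exclusion lets 
the 2.x hbase-client be bundled again:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-hbase-2.2</artifactId>
    <version>${project.version}</version>
    <!-- This leftover exclusion kept the 2.x hbase-client out of the
         shaded jar and had to be removed: -->
    <exclusions>
        <exclusion>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>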



--- Original Message ---
On Thursday, December 8th, 2022 at 11:15, Chesnay Schepler  
wrote:


> 
> 
> -1
> 
> The packaging of the sql-connector-hbase-2.2 module has
> changed significantly, no longer bundling hbase-client (it now only
> bundles Flink classes).
> 
> On 05/12/2022 13:24, Ferenc Csaky wrote:
> 
> > Hi Martijn,
> > 
> > +1 (non-binding)
> > 
> > - Verified hashes/signatures
> > - Maven repo content LGTM
> > - No binaries in the source archive
> > - Built source/tests pass
> > - Tag exists in GH
> > - Reviewed web PR
> > 
> > Thanks,
> > F
> > 
> > --- Original Message ---
> > On Friday, December 2nd, 2022 at 14:04, Martijn Visser 
> > martijnvis...@apache.org wrote:
> > 
> > > Hi everyone,
> > > Please review and vote on the release candidate #1 for the
> > > flink-connector-hbase version v3.0.0, as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > > 
> > > Note: This is the first externalized version of the HBase connector.
> > > 
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org 
> > > [2],
> > > which are signed with the key with fingerprint
> > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v3.0.0-rc1 [5],
> > > * website pull request listing the new release [6].
> > > 
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > > 
> > > Thanks,
> > > Martijn
> > > 
> > > https://twitter.com/MartijnVisser82
> > > https://github.com/MartijnVisser
> > > 
> > > [1]
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352578
> > > [2]
> > > https://dist.apache.org/repos/dist/dev/flink/flink-connector-hbase-3.0.0-rc1/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4] 
> > > https://repository.apache.org/content/repositories/orgapacheflink-1555/
> > > [5] 
> > > https://github.com/apache/flink-connector-hbase/releases/tag/v3.0.0-rc1
> > > [6] https://github.com/apache/flink-web/pull/591
> 
>


Re: [VOTE] Release flink-connector-hbase v3.0.0, release candidate #1

2022-12-05 Thread Ferenc Csaky
Hi Martijn,

+1 (non-binding)

- Verified hashes/signatures
- Maven repo content LGTM
- No binaries in the source archive
- Built source/tests pass
- Tag exists in GH
- Reviewed web PR

Thanks,
F


--- Original Message ---
On Friday, December 2nd, 2022 at 14:04, Martijn Visser 
 wrote:


> 
> 
> Hi everyone,
> Please review and vote on the release candidate #1 for the
> flink-connector-hbase version v3.0.0, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> Note: This is the first externalized version of the HBase connector.
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org [2],
> which are signed with the key with fingerprint
> A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag v3.0.0-rc1 [5],
> * website pull request listing the new release [6].
> 
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> 
> Thanks,
> Martijn
> 
> https://twitter.com/MartijnVisser82
> https://github.com/MartijnVisser
> 
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352578
> [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-hbase-3.0.0-rc1/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1555/
> [5] https://github.com/apache/flink-connector-hbase/releases/tag/v3.0.0-rc1
> [6] https://github.com/apache/flink-web/pull/591


Re: [DISCUSS] assign SQL Table properties from environment variables

2022-12-01 Thread Ferenc Csaky
Hello devs,

I'd like to revive this discussion. There has been a ticket about this effort 
for some time [1], and it affects us as well. Right now we have a custom 
solution that is similar to "environment variables", but it can only be used in 
parts of our downstream product. The main thing we would like to achieve is to 
be able to use variables in DDLs (not necessarily for hiding sensitive 
props). I think it would be really handy to have the ability to reuse values in 
multiple tables.

With that said, comes the temptation to kill two birds with one stone, but a 
sensitive property requires much more care than a regular one, so I think these 
two things should be handled separately, at least in the beginning. The tricky 
part of the "environment variables" is their scope: if they are not coming from 
an external system, it will probably be necessary to persist them, or keep them 
in memory, but that may be insufficient depending on the scope of the 
"environment variables".

Considering the sensitive props, I think a small step forward could be to hide 
the values in case of a "SHOW CREATE TABLE" op.

For a variable to be used in a DDL, I'd imagine it could apply to a whole 
catalog as a start. As long as the catalog is present, those variables would 
be valid.

I did not check implementation details yet, so it is possible I'm missing 
something important or am wrong in some places, but I wanted to get some feedback 
about the idea.
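
Just to make the substitution part more tangible, the core of it could be as 
small as the following (a rough sketch, all names made up):

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that resolves ${var} placeholders in table options. */
public final class OptionVariableResolver {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)}");

    /** Replaces ${name} tokens with values from the given variable scope. */
    public static String resolve(String value, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String replacement = variables.get(name);
            if (replacement == null) {
                throw new IllegalArgumentException("Undefined variable: " + name);
            }
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    private OptionVariableResolver() {}
}

The interesting questions are really around where the variable scope lives and 
how it is persisted, not the substitution itself.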

WDYT?

[1] https://issues.apache.org/jira/browse/FLINK-28028

Best,
F


--- Original Message ---
On Monday, April 4th, 2022 at 09:53, Timo Walther  wrote:


> 
> 
> Hi Fred,
> 
> thanks for starting this discussion. I totally agree that this is an issue
> that the community should solve. It popped up before and is still
> unsolved today. Great that you offer your help here. So let's clarify
> the implementation details.
> 
> 1) Global vs. Local solution
> 
> Is this a DDL-only problem? If yes, it would be easier to solve it in
> the `FactoryUtil` that all Flink connectors and formats use.
> 
> 2) Configuration vs. environment variables
> 
> I agree with Qingsheng that environment variables are not always
> straightforward to identify if you have a "pre-flight phase" and a
> "cluster phase".
> In the DynamicTableFactory, one has access to Flink configuration and
> could resolve `${...}` variables.
> 
> 
> What do you think?
> 
> Regards,
> Timo
> 
> 
> Am 01.04.22 um 12:26 schrieb Qingsheng Ren:
> 
> > Hi Fred,
> > 
> > Thanks for raising the discussion! I think the definition of “environment 
> > variable” varies under different contexts. Under Flink on K8s it means the 
> > environment variable for a container, and if you are a SQL client user it 
> > could refer to environment variable of SQL client, or even the system 
> > properties on JVM. So using “environment variable” is a bit vague under 
> > different environments.
> > 
> > A more generic solution in my mind is that we can take advantage of 
> > configurations in Flink, to pass table options dynamically by adding 
> > configs to TableConfig or even flink-conf.yaml. For example option 
> > “table.dynamic.options.my_catalog.my_db_.my_table.accessId = foo” means 
> > adding table option “accessId = foo” to table “my_catalog.my_db.my_table”. 
> > By this way we could de-couple DDL statement with table options containing 
> > secret credentials. What do you think?
> > 
> > Best regards,
> > 
> > Qingsheng
> > 
> > > On Mar 30, 2022, at 16:25, Teunissen, F.G.J. (Fred) 
> > > fred.teunis...@ing.com.INVALID wrote:
> > > 
> > > Hi devs,
> > > 
> > > Some SQL Table properties contain sensitive data, like passwords that we 
> > > do not want to expose in the VVP ui to other users. Also, having them 
> > > clear text in a SQL statement is not secure. For example,
> > > 
> > > CREATE TABLE Orders (
> > > `user` BIGINT,
> > > product STRING,
> > > order_time TIMESTAMP(3)
> > > ) WITH (
> > > 'connector' = 'kafka',
> > > 
> > > 'properties.bootstrap.servers' = 'kafka-host-1:9093,kafka-host-2:9093',
> > > 'properties.security.protocol' = 'SSL',
> > > 'properties.ssl.key.password' = 'should-be-a-secret',
> > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
> > > 'properties.ssl.keystore.password' = 'should-also-be-a-secret',
> > > 'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
> > > 'properties.ssl.truststore.password' = 'should-again-be-a-secret',
> > > 'scan.startup.mode' = 'earliest-offset'
> > > );
> > > 
> > > I would like to bring up for a discussion a proposal to provide these 
> > > secrets values via environment variables since these can be populated 
> > > from a K8s configMap or secrets.
> > > 
> > > For implementing the SQL Table properties, the ConfigOption class is 
> > > used in connectors and formatters. This class could be extended that it 
> > > checks whether the config-value contains certain tokens, like 
> > > ‘${env-var-name}’. If 

Re: [DISCUSS] Retroactively externalize some connectors for 1.16

2022-12-01 Thread Ferenc Csaky
Hi!

I think this would be a good idea. I was wondering whether we could include the 
hbase connector in this group as well? The externalization PR [1] should be in 
good shape now, and Dec 9th as a release date sounds doable.

WDYT?

[1] https://github.com/apache/flink-connector-hbase/pull/2

Best,
F




--- Original Message ---
On Thursday, December 1st, 2022 at 16:01, Chesnay Schepler  
wrote:


> 
> 
> Hello,
> 
> let me clarify the title first.
> 
> In the original proposal for the connector externalization we said that
> an externalized connector has to exist in parallel with the version
> shipped in the main Flink release for 1 cycle.
> 
> For example, 1.16.0 shipped with the elasticsearch connector, but at the
> same time there's the externalized variant as a drop-in replacement, and
> the 1.17.0 release will not include a ES connector.
> 
> The rationale was to give users some window to update their projects.
> 
> 
> We are now about to externalize a few more connectors (cassandra,
> pulsar, jdbc), targeting 1.16 within the next week.
> The 1.16.0 release was about a month ago, so it hasn't been a
> lot of time since then.
> I'm now wondering if we could/should treat these connectors as
> externalized for 1.16, meaning that we would remove them from the master
> branch now, not ship them in 1.17 and move all further development into
> the connector repos.
> 
> The main benefit is that we won't have to bother with syncing changes
> across repos all the time.
> 
> We would of course need some sort-of cutoff date for this (December
> 9th?), to ensure there's still some reasonably large gap left for users
> to migrate.
> 
> Let me know what you think.
> 
> Regards,
> Chesnay


Re: [VOTE] FLIP-271: Autoscaling

2022-11-29 Thread Ferenc Csaky
+1 (non-binding)




--- Original Message ---
On Tuesday, November 29th, 2022 at 15:39, Márton Balassi 
 wrote:


> 
> 
> +1 (binding)
> 
> On Tue, Nov 29, 2022 at 6:13 AM Chenya Zhang chenyazhangche...@gmail.com
> 
> wrote:
> 
> > +1 (non-binding)
> > 
> > On Sun, Nov 27, 2022 at 5:49 PM Jiangang Liu liujiangangp...@gmail.com
> > wrote:
> > 
> > > +1 (non-binding)
> > > 
> > > Best,
> > > Jiangang Liu
> > > 
> > > Thomas Weise t...@apache.org 于2022年11月28日周一 06:23写道:
> > > 
> > > > +1 (binding)
> > > > 
> > > > On Sat, Nov 26, 2022 at 8:11 AM Zheng Yu Chen jam.gz...@gmail.com
> > > > wrote:
> > > > 
> > > > > +1(no-binding)
> > > > > 
> > > > > Maximilian Michels m...@apache.org 于 2022年11月24日周四 上午12:25写道:
> > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > I'd like to start a vote for FLIP-271 [1] which we previously
> > > > > > discussed
> > > > > > on
> > > > > > the dev mailing list [2].
> > > > > > 
> > > > > > I'm planning to keep the vote open for at least until Tuesday, Nov
> > > > > > 29.
> > > > > > 
> > > > > > -Max
> > > > > > 
> > > > > > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
> > 
> > > > > > [2]
> > > > > > https://lists.apache.org/thread/pvfb3fw99mj8r1x8zzyxgvk4dcppwssz


Re: [VOTE] FLIP-272: Generalized delegation token support

2022-11-21 Thread Ferenc Csaky
+1 (non-binding)

Best,
F


--- Original Message ---
On Friday, November 18th, 2022 at 16:11, Márton Balassi 
 wrote:


> 
> 
> +1 (binding)
> 
> On Thu, Nov 17, 2022 at 9:06 AM Gabor Somogyi gabor.g.somo...@gmail.com
> 
> wrote:
> 
> > Hi All,
> > 
> > I'm hereby opening a vote for FLIP-272 Generalized delegation token
> > support.
> > The related documents can be found here:
> > - FLIP on wiki: [1]
> > - Discussion thread: [2]
> > 
> > Voting will be open for at least 72 hours (since weekend is involved EOB
> > Monday is planned).
> > 
> > BR,
> > G
> > 
> > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-272%3A+Generalized+delegation+token+support
> > [2] https://lists.apache.org/thread/vgg5hbf5jljcxopfhb32w3l0wjoyko4o


Re: [ANNOUNCE] New Apache Flink Committer - Matyas Orhidi

2022-11-21 Thread Ferenc Csaky
Congrats Matyas!

Best,
F



--- Original Message ---
On Monday, November 21st, 2022 at 15:17, Márton Balassi  
wrote:


> 
> 
> Hi everyone,
> 
> On behalf of the PMC, I'm very happy to announce Matyas Orhidi as a new
> Flink
> committer.
> 
> Matyas has over a decade of experience in the Big Data ecosystem and has
> been working with Flink full time for the past 3 years. In the open source
> community he is one of the key driving members of the Kubernetes Operator
> subproject. He implemented multiple key features in the operator including
> the metrics system and the ability to dynamically configure watched
> namespaces. He enjoys spreading the word about Flink and regularly does so
> via authoring blogposts and giving talks or interviews representing the
> community.
> 
> Please join me in congratulating Matyas for becoming a Flink committer!
> 
> Best,
> Marton


Re: [DISCUSS] Updating Flink HBase Connectors

2022-11-21 Thread Ferenc Csaky
Hi!

Sure, thank you both for the input. If the repo is ready, I'll start the work.

Best,
F


--- Original Message ---
On Thursday, November 17th, 2022 at 09:39, Gabor Somogyi 
 wrote:


> 
> 
> Hi All,
> 
> +1 to go. Since we are refurbishing the HBase area maybe we can move the
> token provider into the HBase base project.
> This would fit into the high level effort to extract everything into
> external connectors. If you do this and facing any issues just ping me :)
> 
> BR,
> G
> 
> 
> On Thu, Nov 17, 2022 at 9:29 AM Martijn Visser martijnvis...@apache.org
> 
> wrote:
> 
> > Hi Ferenc,
> > 
> > I think you're good to go, since no comments were there. Do let us know if
> > you need any help :)
> > 
> > Thanks,
> > 
> > Martijn
> > 
> > On Mon, Oct 24, 2022 at 7:47 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > wrote:
> > 
> > > Hi,
> > > 
> > > just pinging this thread in case someone missed it and has any opinion
> > > about the discussed actions.
> > > 
> > > Best,
> > > F
> > > 
> > > --- Original Message ---
> > > On Tuesday, October 11th, 2022 at 23:29, Ferenc Csaky
> > > ferenc.cs...@pm.me.INVALID wrote:
> > > 
> > > > Hi Martijn,
> > > > 
> > > > Thank you for your comment. About HBase 2.x, correct, that is my
> > > > thought
> > > > process, but it has to be tested and verified.
> > > > 
> > > > +1 from my side about merging these updates with the connector
> > > > externalization.
> > > > 
> > > > Best,
> > > > F
> > > > 
> > > > --- Original Message ---
> > > > On Tuesday, October 11th, 2022 at 16:30, Martijn Visser
> > > > martijnvis...@apache.org wrote:
> > > > 
> > > > > Hi Ferenc,
> > > > > 
> > > > > Thanks for opening the discussion on this topic!
> > > > > 
> > > > > +1 for dropping HBase 1.x.
> > > > > 
> > > > > Regarding HBase 2.x, if I understand correctly it should be possible
> > > > > to
> > > > > connect to any 2.x cluster if you're using the 2.x client. Wouldn't
> > > > > it
> > > > > make
> > > > > more sense to always support the latest available version, so
> > > > > basically 2.5
> > > > > at the moment? We could always include a test to check that
> > > > > implementation
> > > > > against an older HBase version.
> > > > > 
> > > > > I also have a follow-up question already: if there's an agreement on
> > > > > this
> > > > > topic, does it make sense to directly build a new HBase connector in
> > > > > its
> > > > > own external connector repo, since I believe the current connector
> > > > > uses the
> > > > > old source/sink interfaces. We could then directly drop the ones in
> > > > > the
> > > > > Flink repo and replace them with new implementations?
> > > > > 
> > > > > Best regards,
> > > > > 
> > > > > Martijn
> > > > > 
> > > > > Op ma 10 okt. 2022 om 16:24 schreef Ferenc Csaky
> > > > >  > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > Now that the connector externalization effort is going on, I think
> > > > > > it is
> > > > > > definitely worth revisiting the currently supported HBase versions
> > > > > > for the
> > > > > > Flink connector. Currently, there are HBase 1.4 and HBase 2.2
> > > > > > connector
> > > > > > versions, although both of those versions are kind of outdated.
> > > > > > 
> > > > > > From the HBase point of view the following can be considered [1]:
> > > > > > 
> > > > > > - HBase 1.x is dead, so on the way forward it should be safe to
> > > > > > drop
> > > > > > it.
> > > > > > - HBase 2.2 is EoL, but still used actively; we are also supporting
> > > > > > it in
> > > > > > some of our still active releases at Cloudera.
> > > > > > - HBase 2.4 is the main thing now and probably will be supported
> > > > > > for
> > > > > > a
> > > > > > while (by us, definitely).
> > > > > > - HBase 2.5 just came out, but 2.6 is expected pretty soon, so it
> > > > > > is
> > > > > > possible it won't live long.
> > > > > > - HBase 3 is in alpha, but shooting for that probably would be
> > > > > > early
> > > > > > in
> > > > > > the near future.
> > > > > > 
> > > > > > In addition, if we are only using the standard HBase 2.x client
> > > > > > APIs, then
> > > > > > it should be possible to compile it with any HBase 2.x version.
> > > > > > Also,
> > > > > > any HBase 2.x cluster should be backwards compatible with all
> > > > > > earlier HBase
> > > > > > 2.x client libraries. I did not check this part thoroughly but I
> > > > > > think this
> > > > > > should be true, so ideally it would be enough to have an HBase 2.4
> > > > > > connector. [2]
> > > > > > 
> > > > > > Looking forward to your opinions about this topic.
> > > > > > 
> > > > > > Best,
> > > > > > F
> > > > > > 
> > > > > > [1] https://hbase.apache.org/downloads.html
> > > > > > [2] https://hbase.apache.org/book.html#hbase.versioning.post10
> > > > > > (Client
> > > > > > API compatibility)
> > > > > 
> > > > > --
> > > > > Martijn
> > > > > https://twitter.com/MartijnVisser82
> > > > > https://github.com/MartijnVisser


Re: [DISCUSS] Updating Flink HBase Connectors

2022-10-24 Thread Ferenc Csaky
Hi,

just pinging this thread in case someone missed it and has any opinion about 
the discussed actions.

Best,
F




--- Original Message ---
On Tuesday, October 11th, 2022 at 23:29, Ferenc Csaky 
 wrote:


> 
> 
> Hi Martijn,
> 
> Thank you for your comment. About HBase 2.x, correct, that is my thought 
> process, but it has to be tested and verified.
> 
> +1 from my side about merging these updates with the connector 
> externalization.
> 
> Best,
> F
> 
> 
> --- Original Message ---
> On Tuesday, October 11th, 2022 at 16:30, Martijn Visser 
> martijnvis...@apache.org wrote:
> 
> 
> 
> > Hi Ferenc,
> > 
> > Thanks for opening the discussion on this topic!
> > 
> > +1 for dropping HBase 1.x.
> > 
> > Regarding HBase 2.x, if I understand correctly it should be possible to
> > connect to any 2.x cluster if you're using the 2.x client. Wouldn't it make
> > more sense to always support the latest available version, so basically 2.5
> > at the moment? We could always include a test to check that implementation
> > against an older HBase version.
> > 
> > I also have a follow-up question already: if there's an agreement on this
> > topic, does it make sense to directly build a new HBase connector in its
> > own external connector repo, since I believe the current connector uses the
> > old source/sink interfaces. We could then directly drop the ones in the
> > Flink repo and replace them with new implementations?
> > 
> > Best regards,
> > 
> > Martijn
> > 
> > Op ma 10 okt. 2022 om 16:24 schreef Ferenc Csaky  > 
> > > Hi everyone,
> > > 
> > > Now that the connector externalization effort is going on, I think it is
> > > definitely worth revisiting the currently supported HBase versions for the
> > > Flink connector. Currently, there are HBase 1.4 and HBase 2.2 connector
> > > versions, although both of those versions are kind of outdated.
> > > 
> > > From the HBase point of view the following can be considered [1]:
> > > 
> > > - HBase 1.x is dead, so on the way forward it should be safe to drop it.
> > > - HBase 2.2 is EoL, but still used actively; we are also supporting it in
> > > some of our still active releases at Cloudera.
> > > - HBase 2.4 is the main thing now and probably will be supported for a
> > > while (by us, definitely).
> > > - HBase 2.5 just came out, but 2.6 is expected pretty soon, so it is
> > > possible it won't live long.
> > > - HBase 3 is in alpha, but shooting for that probably would be early in
> > > the near future.
> > > 
> > > In addition, if we are only using the standard HBase 2.x client APIs, then
> > > it should be possible to compile it with any HBase 2.x version. Also,
> > > any HBase 2.x cluster should be backwards compatible with all earlier 
> > > HBase
> > > 2.x client libraries. I did not check this part thoroughly but I think this
> > > should be true, so ideally it would be enough to have an HBase 2.4
> > > connector. [2]
> > > 
> > > Looking forward to your opinions about this topic.
> > > 
> > > Best,
> > > F
> > > 
> > > [1] https://hbase.apache.org/downloads.html
> > > [2] https://hbase.apache.org/book.html#hbase.versioning.post10 (Client
> > > API compatibility)
> > 
> > --
> > Martijn
> > https://twitter.com/MartijnVisser82
> > https://github.com/MartijnVisser


[jira] [Created] (FLINK-29707) Fix possible comparator violation for "flink list"

2022-10-20 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-29707:


 Summary: Fix possible comparator violation for "flink list"
 Key: FLINK-29707
 URL: https://issues.apache.org/jira/browse/FLINK-29707
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.16.0
Reporter: Ferenc Csaky


For the {{list}} CLI option, the code that prints the jobs defines a 
{{startTimeComparator}} to order them, like this:
{code:java}
Comparator<JobStatusMessage> startTimeComparator =
        (o1, o2) -> (int) (o1.getStartTime() - o2.getStartTime());
{code}
The int cast of the long difference can overflow, which in rare situations leads to this:
{code:java}
2022-10-19 09:58:11,690 ERROR org.apache.flink.client.cli.CliFrontend   
   [] - Error while running the command.
java.lang.IllegalArgumentException: Comparison method violates its general 
contract!
at java.util.TimSort.mergeLo(TimSort.java:777) ~[?:1.8.0_312]
at java.util.TimSort.mergeAt(TimSort.java:514) ~[?:1.8.0_312]
at java.util.TimSort.mergeForceCollapse(TimSort.java:457) ~[?:1.8.0_312]
at java.util.TimSort.sort(TimSort.java:254) ~[?:1.8.0_312]
at java.util.Arrays.sort(Arrays.java:1512) ~[?:1.8.0_312]
at java.util.ArrayList.sort(ArrayList.java:1464) ~[?:1.8.0_312]
at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:392) 
~[?:1.8.0_312]
at java.util.stream.Sink$ChainedReference.end(Sink.java:258) 
~[?:1.8.0_312]
at java.util.stream.Sink$ChainedReference.end(Sink.java:258) 
~[?:1.8.0_312]
at 
java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:363) 
~[?:1.8.0_312]
at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483) 
~[?:1.8.0_312]
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) 
~[?:1.8.0_312]
at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
~[?:1.8.0_312]
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
 ~[?:1.8.0_312]
at 
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
~[?:1.8.0_312]
at 
java.util.stream.ReferencePipeline.forEachOrdered(ReferencePipeline.java:490) 
~[?:1.8.0_312]
at 
org.apache.flink.client.cli.CliFrontend.printJobStatusMessages(CliFrontend.java:574)
{code}
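
A straightforward, overflow-safe fix could look like this (assuming the element 
type is {{JobStatusMessage}}, as in {{CliFrontend}}):
{code:java}
// Long.compare avoids the int cast, keeping the comparator contract intact.
Comparator<JobStatusMessage> startTimeComparator =
        Comparator.comparingLong(JobStatusMessage::getStartTime);
{code}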



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Externalized connector release details​

2022-10-12 Thread Ferenc Csaky
+1 from my side (non-binding)

Best,
F


--- Original Message ---
On Wednesday, October 12th, 2022 at 15:47, Martijn Visser 
 wrote:


> 
> 
> +1 (binding), I am indeed assuming that Chesnay meant the last two minor
> versions as supported.
> 
> Op wo 12 okt. 2022 om 20:18 schreef Danny Cranmer dannycran...@apache.org
> 
> > Thanks for the concise summary Chesnay.
> > 
> > +1 from me (binding)
> > 
> > Just one clarification, for "3.1) The Flink versions supported by the
> > project (last 2 major Flink versions) must be supported.". Do we actually
> > mean major here, as in Flink 1.x.x and 2.x.x? Right now we would only
> > support Flink 1.15.x and not 1.14.x? I would be inclined to support the
> > latest 2 minor Flink versions (major.minor.patch) given that we only have 1
> > active major Flink version.
> > 
> > Danny
> > 
> > On Wed, Oct 12, 2022 at 2:12 PM Chesnay Schepler ches...@apache.org
> > wrote:
> > 
> > > Since the discussion
> > > (https://lists.apache.org/thread/mpzzlpob9ymkjfybm96vz2y2m5fjyvfo) has
> > > stalled a bit but we need a conclusion to move forward I'm opening a
> > > vote.
> > > 
> > > Proposal summary:
> > > 
> > > 1) Branch model
> > > 1.1) The default branch is called "main" and used for the next major
> > > iteration.
> > > 1.2) Remaining branches are called "vmajor.minor". (e.g., v3.2)
> > > 1.3) Branches are not specific to a Flink version. (i.e., no v3.2-1.15)
> > > 
> > > 2) Versioning
> > > 2.1) Source releases: major.minor.patch
> > > 2.2) Jar artifacts: major.minor.patch-flink-major.flink-minor (e.g., 3.0.0-1.16)
> > > (This may imply releasing the exact same connector jar multiple times
> > > under different versions)
> > > 
> > > 3) Flink compatibility
> > > 3.1) The Flink versions supported by the project (last 2 major Flink
> > > versions) must be supported.
> > > 3.2) How this is achieved is left to the connector, as long as it
> > > conforms to the rest of the proposal.
> > > 
> > > 4) Support
> > > 4.1) The last 2 major connector releases are supported with only the
> > > latter receiving additional features, with the following exceptions:
> > > 4.1.a) If the older major connector version does not support any
> > > currently supported Flink version, then it is no longer supported.
> > > 4.1.b) If the last 2 major versions do not cover all supported Flink
> > > versions, then the latest connector version that supports the older
> > > Flink version additionally gets patch support.
> > > 4.2) For a given major connector version only the latest minor version
> > > is supported.
> > > (This means if 1.1.x is released there will be no more 1.0.x release)
> > > 
> > > I'd like to clarify that these won't be set in stone for eternity.
> > > We should re-evaluate how well this model works over time and adjust it
> > > accordingly, consistently across all connectors.
> > > I do believe that as is this strikes a good balance between
> > > maintainability for us and clarity to users.
> > > 
> > > Voting schema:
> > > 
> > > Consensus, committers have binding votes, open for at least 72 hours.
> 
> --
> Martijn
> https://twitter.com/MartijnVisser82
> https://github.com/MartijnVisser


Re: [DISCUSS] Updating Flink HBase Connectors

2022-10-11 Thread Ferenc Csaky
Hi Martijn,

Thank you for your comment. About HBase 2.x, correct, that is my thought 
process, but it has to be tested and verified.

+1 from my side about merging these updates with the connector externalization.

Best,
F


--- Original Message ---
On Tuesday, October 11th, 2022 at 16:30, Martijn Visser 
 wrote:


> 
> 
> Hi Ferenc,
> 
> Thanks for opening the discussion on this topic!
> 
> +1 for dropping HBase 1.x.
> 
> Regarding HBase 2.x, if I understand correctly it should be possible to
> connect to any 2.x cluster if you're using the 2.x client. Wouldn't it make
> more sense to always support the latest available version, so basically 2.5
> at the moment? We could always include a test to check that implementation
> against an older HBase version.
> 
> I also have a follow-up question already: if there's an agreement on this
> topic, does it make sense to directly build a new HBase connector in its
> own external connector repo, since I believe the current connector uses the
> old source/sink interfaces. We could then directly drop the ones in the
> Flink repo and replace them with new implementations?
> 
> Best regards,
> 
> Martijn
> 
> Op ma 10 okt. 2022 om 16:24 schreef Ferenc Csaky  
> > Hi everyone,
> > 
> > Now that the connector externalization effort is going on, I think it is
> > definitely worth revisiting the currently supported HBase versions for the
> > Flink connector. Currently, there are HBase 1.4 and HBase 2.2 connector
> > versions, although both of those versions are kind of outdated.
> > 
> > From the HBase point of view the following can be considered [1]:
> > 
> > - HBase 1.x is dead, so on the way forward it should be safe to drop it.
> > - HBase 2.2 is EoL, but still used actively; we are also supporting it in
> > some of our still active releases at Cloudera.
> > - HBase 2.4 is the main thing now and probably will be supported for a
> > while (by us, definitely).
> > - HBase 2.5 just came out, but 2.6 is expected pretty soon, so it is
> > possible it won't live long.
> > - HBase 3 is in alpha, but shooting for that probably would be early in
> > the near future.
> > 
> > In addition, if we are only using the standard HBase 2.x client APIs, then
> > it should be possible to compile it with any HBase 2.x version. Also,
> > any HBase 2.x cluster should be backwards compatible with all earlier HBase
> > 2.x client libraries. I did not check this part thoroughly but I think this
> > should be true, so ideally it would be enough to have an HBase 2.4
> > connector. [2]
> > 
> > Looking forward to your opinions about this topic.
> > 
> > Best,
> > F
> > 
> > [1] https://hbase.apache.org/downloads.html
> > [2] https://hbase.apache.org/book.html#hbase.versioning.post10 (Client
> > API compatibility)
> 
> 
> --
> Martijn
> https://twitter.com/MartijnVisser82
> https://github.com/MartijnVisser


[DISCUSS] Updating Flink HBase Connectors

2022-10-10 Thread Ferenc Csaky
Hi everyone,

Now that the connector externalization effort is going on, I think it is 
definitely worth revisiting the currently supported HBase versions for the Flink 
connector. Currently, there are HBase 1.4 and HBase 2.2 connector versions, 
although both of those versions are kind of outdated.

From the HBase point of view the following can be considered [1]:

- HBase 1.x is dead, so on the way forward it should be safe to drop it.
- HBase 2.2 is EoL, but still used actively; we are also supporting it in some 
of our still active releases at Cloudera.
- HBase 2.4 is the main thing now and probably will be supported for a while 
(by us, definitely).
- HBase 2.5 just came out, but 2.6 is expected pretty soon, so it is possible 
it won't live long.
- HBase 3 is in alpha, but shooting for that probably would be early in the 
near future.

In addition, if we are only using the standard HBase 2.x client APIs, then it 
should be possible to compile it with any HBase 2.x version. Also, any HBase 
2.x cluster should be backwards compatible with all earlier HBase 2.x client 
libraries. I did not check this part thoroughly but I think this should be true, 
so ideally it would be enough to have an HBase 2.4 connector. [2]

Looking forward to your opinions about this topic.

Best,
F

[1] https://hbase.apache.org/downloads.html
[2] https://hbase.apache.org/book.html#hbase.versioning.post10 (Client API 
compatibility)

[jira] [Created] (FLINK-27441) Scrollbar is missing for particular UI elements (Accumulators, Backpressure, Watermarks)

2022-04-28 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-27441:


 Summary: Scrollbar is missing for particular UI elements 
(Accumulators, Backpressure, Watermarks)
 Key: FLINK-27441
 URL: https://issues.apache.org/jira/browse/FLINK-27441
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.14.3, 1.15.0
Reporter: Ferenc Csaky


The Angular version bump introduced a bug where {{nzScroll}} does not support 
percentages in CSS calc, so the scrollbar becomes invisible. There is an easy 
workaround; the linked Angular discussion covers it.

Angular issue: https://github.com/NG-ZORRO/ng-zorro-antd/issues/3090



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


RE: Re: Looking for Maintainers for Flink on YARN

2022-01-28 Thread Ferenc Csaky
Hi Konstantin,

We at Cloudera will also help out with this. AFAIK there was a conversation 
about this in the past anyways. I will talk this through with the team next 
week and allocate resources accordingly.

Regards,
F

On 2022/01/26 09:17:03 Konstantin Knauf wrote:
> Hi everyone,
>
> We are seeing an increasing number of test instabilities related to YARN
> [1]. Does someone in this group have the time to pick these up? The Flink
> Confluence contains a guide on how to triage test instability tickets.
>
> Thanks,
>
> Konstantin
>
> [1]
> https://issues.apache.org/jira/browse/FLINK-25514?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20component%20%3D%20%22Deployment%20%2F%20YARN%22%20AND%20labels%20%3D%20test-stability
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/Triage+Test+Instability+Tickets
>
> On Mon, Sep 13, 2021 at 2:22 PM 柳尘  wrote:
>
> > Thanks to Konstantin for raising this question, and to Marton and Gabor
> > for stepping up!
> >
> > If I can help, please let me know how I can best participate in the work.
> >
> > Best,
> > Cheng Xingyuan
> >
> >
> > > 2021年7月29日 下午4:15,Konstantin Knauf  写道:
> > >
> > > Dear community,
> > >
> > > We are looking for community members, who would like to maintain Flink's
> > > YARN support going forward. So far, this has been handled by teams at
> > > Ververica & Alibaba. The focus of these teams has shifted over the past
> > > months so that we only have little time left for this topic. Still, we
> > > think, it is important to maintain high quality support for Flink on
> > YARN.
> > >
> > > What does "Maintaining Flink on YARN" mean? There are no known bigger
> > > efforts outstanding. We are mainly talking about addressing
> > > "test-stability" issues, bugs, version upgrades, community contributions
> > &
> > > smaller feature requests. The prioritization of these would be up to the
> > > future maintainers, except "test-stability" issues which are important to
> > > address for overall productivity.
> > >
> > > If a group of community members forms itself, we are happy to give an
> > > introduction to relevant pieces of the code base, principles,
> > assumptions,
> > > ... and hand over open threads.
> > >
> > > If you would like to take on this responsibility or can join this effort
> > in
> > > a supporting role, please reach out!
> > >
> > > Cheers,
> > >
> > > Konstantin
> > > for the Deployment & Coordination Team at Ververica
> > >
> > > --
> > >
> > > Konstantin Knauf
> > >
> > > https://twitter.com/snntrable
> > >
> > > https://github.com/knaufk
> >
> >
>
> --
>
> Konstantin Knauf | Head of Product
>
> +49 160 91394525
>
>
> Follow us @VervericaData Ververica 
>
>
> --
>
> Join Flink Forward  - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Karl Anton Wehner, Holger Temme, Yip Park Tung Jason,
> Jinwei (Kevin) Zhang
>


RE: Re: [DISCUSS] Future of Per-Job Mode

2022-01-28 Thread Ferenc Csaky
Hi Yang,

Thank you for the clarification. In general, I think we will have time to 
experiment with this until it is removed completely, and to migrate our solution 
to use application mode.
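
For reference, what we plan to experiment with is shipping our extra jars in 
application mode via the mentioned option, roughly like this (an untested 
sketch, all paths made up):

./bin/flink run-application -t yarn-application \
    -Dyarn.ship-files="/opt/our-service/connector-jars;/opt/our-service/default-jars" \
    /opt/our-service/jobs/our-flink-job.jar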

Regards,
F

On 2022/01/26 02:42:24 Yang Wang wrote:
> Hi all,
>
> I remember the application mode was initially named "cluster mode". As a
> contrast, the per-job mode is the "client mode".
> So I believe application mode should cover all the functionalities of
> per-job except where we are running the user main code.
> In the containerized or the Kubernetes world, the application mode is more
> native and easy to use since all the Flink and user
> jars are bundled in the image. I am also in favor of deprecating and
> removing the per-job in the long run.
>
>
>
> @Ferenc
> IIRC, the YARN application mode could ship user jars and dependencies via
> "yarn.ship-files" config option. The only
> limitation is that we could not ship and load the user dependencies with the
> user classloader instead of the parent classloader.
> FLINK-24897 is trying to fix this via supporting "usrlib" directory
> automatically.
>
>
> Best,
> Yang
>
>
>
> Ferenc Csaky  于2022年1月25日周二 22:05写道:
>
> > Hi Konstantin,
> >
> > First of all, sorry for the delay. We at Cloudera are currently relying on
> > per-job mode for deploying Flink applications over YARN.
> >
> > Specifically, we allow users to upload connector jars and other artifacts.
> > There are also some default jars that we need to ship. These are all stored
> > on the local file system of our service’s node. The Flink job is submitted
> > on the users’ behalf by our service, which also specifies the jars to ship.
> > The service runs on a single node, not on all nodes with Flink TM/JM. It
> > would thus be difficult to manage the jars on every node.
> >
> > We are not familiar with the reasoning behind why application mode
> > currently doesn’t ship the user jars, besides the deployment being faster
> > this way. Would it be possible for the application mode to (optionally,
> > enabled by some config) distribute these, or are there some technical
> > limitations?
> >
> > For us it would be crucial to achieve the functionality we have at the
> > moment over YARN. We started to track
> > https://issues.apache.org/jira/browse/FLINK-24897 that Biao Geng
> > mentioned as well.
> >
> > Considering the above, a more soonish removal does not sound too good to
> > us. We can live with this feature as deprecated of course, but it
> > would be nice to have some time to figure out how we can utilize
> > Application Mode exactly and make necessary changes if required.
> >
> > Thank you,
> > F
> >
> > On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> > > Hi everyone,
> > >
> > > I would like to discuss and understand if the benefits of having Per-Job
> > > Mode in Apache Flink outweigh its drawbacks.
> > >
> > >
> > > *# Background: Flink's Deployment Modes*
> > > Flink currently has three deployment modes. They differ in the following
> > > dimensions:
> > > * main() method executed on Jobmanager or Client
> > > * dependencies shipped by client or bundled with all nodes
> > > * number of jobs per cluster & relationship between job and cluster
> > > lifecycle* (supported resource providers)
> > >
> > > ## Application Mode
> > > * main() method executed on Jobmanager
> > > * dependencies already need to be available on all nodes
> > > * dedicated cluster for all jobs executed from the same main()-method
> > > (Note: applications with more than one job, currently still significant
> > > limitations like missing high-availability). Technically, a session
> > cluster
> > > dedicated to all jobs submitted from the same main() method.
> > > * supported by standalone, native kubernetes, YARN
> > >
> > > ## Session Mode
> > > * main() method executed in client
> > > * dependencies are distributed from and by the client to all nodes
> > > * cluster is shared by multiple jobs submitted from different clients,
> > > independent lifecycle
> > > * supported by standalone, Native Kubernetes, YARN
> > >
> > > ## Per-Job Mode
> > > * main() method executed in client
> > > * dependencies are distributed from and by the client to all nodes
> > > * dedicated cluster for a single job
> > > * supported by YARN only
> > >
> > >
> > > *# Reasons to Keep*
> > > * There are use 

RE: [DISCUSS] Future of Per-Job Mode

2022-01-25 Thread Ferenc Csaky
Hi Konstantin,

First of all, sorry for the delay. We at Cloudera are currently relying on 
per-job mode for deploying Flink applications over YARN.

Specifically, we allow users to upload connector jars and other artifacts. 
There are also some default jars that we need to ship. These are all stored on 
the local file system of our service’s node. The Flink job is submitted on the 
users’ behalf by our service, which also specifies the jars to ship. The 
service runs on a single node, not on all nodes with Flink TM/JM. It would thus 
be difficult to manage the jars on every node.

We are not familiar with the reasoning behind why application mode currently 
doesn’t ship the user jars, besides the deployment being faster this way. Would 
it be possible for the application mode to (optionally, enabled by some config) 
distribute these, or are there some technical limitations?

For us it would be crucial to achieve the functionality we have at the moment 
over YARN. We started to track 
https://issues.apache.org/jira/browse/FLINK-24897 that Biao Geng mentioned as 
well.

Considering the above, a more soonish removal does not sound too good to us. 
We can live with this feature as deprecated of course, but it would be 
nice to have some time to figure out how we can utilize Application Mode 
exactly and make necessary changes if required.

Thank you,
F

On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> Hi everyone,
>
> I would like to discuss and understand if the benefits of having Per-Job
> Mode in Apache Flink outweigh its drawbacks.
>
>
> *# Background: Flink's Deployment Modes*
> Flink currently has three deployment modes. They differ in the following
> dimensions:
> * main() method executed on Jobmanager or Client
> * dependencies shipped by client or bundled with all nodes
> * number of jobs per cluster & relationship between job and cluster
> lifecycle* (supported resource providers)
>
> ## Application Mode
> * main() method executed on Jobmanager
> * dependencies already need to be available on all nodes
> * dedicated cluster for all jobs executed from the same main()-method
> (Note: applications with more than one job, currently still significant
> limitations like missing high-availability). Technically, a session cluster
> dedicated to all jobs submitted from the same main() method.
> * supported by standalone, native kubernetes, YARN
>
> ## Session Mode
> * main() method executed in client
> * dependencies are distributed from and by the client to all nodes
> * cluster is shared by multiple jobs submitted from different clients,
> independent lifecycle
> * supported by standalone, Native Kubernetes, YARN
>
> ## Per-Job Mode
> * main() method executed in client
> * dependencies are distributed from and by the client to all nodes
> * dedicated cluster for a single job
> * supported by YARN only
>
>
> *# Reasons to Keep*
> * There are use cases where you might need the
> combination of a single job per cluster, but main() method execution in the
> client. This combination is only supported by per-job mode.
> * It currently exists. Existing users will need to migrate to either
> session or application mode.
>
>
> *# Reasons to Drop*
> * With Per-Job Mode and Application Mode we have two
> modes that for most users probably do the same thing. Specifically, for
> those users that don't care where the main() method is executed and want to
> submit a single job per cluster. Having two ways to do the same thing is
> confusing.
> * Per-Job Mode is only supported by YARN anyway. If we keep it, we should
> work towards support in Kubernetes and Standalone, too, to reduce special
> casing.
> * Dropping per-job mode would reduce complexity in the code and allow us to
> dedicate more resources to the other two deployment modes.
> * I believe with session mode and application mode we have two easily
> distinguishable and understandable deployment modes that cover Flink's use
> cases:
> * session mode: olap-style, interactive jobs/queries, short lived batch
> jobs, very small jobs, traditional cluster-centric deployment mode (fits
> the "Hadoop world")
> * application mode: long-running streaming jobs, large scale &
> heterogenous jobs (resource isolation!), application-centric deployment
> mode (fits the "Kubernetes world")
>
>
> *# Call to Action*
> * Do you use per-job mode? If so, why & would you be able to migrate to one
> of the other methods?
> * Am I missing any pros/cons?
> * Are you in favor of dropping per-job mode midterm?
>
> Cheers and thank you,
>
> Konstantin
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>