[jira] [Created] (FLINK-27028) Support to upload jar and run jar in RestClusterClient

2022-04-02 Thread Aitozi (Jira)
Aitozi created FLINK-27028:
--

 Summary: Support to upload jar and run jar in RestClusterClient
 Key: FLINK-27028
 URL: https://issues.apache.org/jira/browse/FLINK-27028
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Reporter: Aitozi


The flink-kubernetes-operator uses the JarUpload + JarRun endpoints to support 
session job management. However, the RestClusterClient currently does not expose 
a way to upload a user jar to a session cluster or to trigger the jar run API, 
so I had to use the naked RestClient to achieve this.

Can we expose these two APIs in the RestClusterClient to make it more 
convenient to use in the operator?
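For context, the two calls map to the documented session cluster REST endpoints (POST /jars/upload and POST /jars/:jarid/run). A minimal sketch of building those requests, with hypothetical helper names and illustrative `requests` usage rather than the operator's actual code:

```python
def jar_upload_url(base_url: str) -> str:
    # POST target for uploading a user jar to a session cluster
    return f"{base_url}/jars/upload"

def jar_run_url(base_url: str, jar_id: str) -> str:
    # POST target for running a previously uploaded jar
    return f"{base_url}/jars/{jar_id}/run"

# Illustrative usage against a hypothetical JobManager address:
# import requests
# with open("job.jar", "rb") as f:
#     resp = requests.post(jar_upload_url("http://jobmanager:8081"),
#                          files={"jarfile": f})
# jar_id = resp.json()["filename"].rsplit("/", 1)[-1]
# requests.post(jar_run_url("http://jobmanager:8081", jar_id),
#               json={"entryClass": "com.example.Main", "parallelism": 2})
```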



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [RESULT] [VOTE] Apache Flink Kubernetes Operator Release 0.1.0, release candidate #3

2022-04-02 Thread Aitozi
Nice to see this happen.

Cheers,
Aitozi.

Márton Balassi  于2022年4月2日周六 23:08写道:

> Thank you team, it was a pleasure to witness the community investing into
> this much requested topic and coming to an initial release so swiftly.
>
> On Sat, Apr 2, 2022 at 4:45 PM Gyula Fóra  wrote:
>
> > I'm happy to announce that we have unanimously approved this release.
> >
> > There are 8 approving votes, 4 of which are binding:
> > * Marton Balassi (binding)
> > * Yang Wang (non-binding)
> > * Gyula Fora (binding)
> > * Biao Geng (non-binding)
> > * Thomas Weise (binding)
> > * Xintong Song (binding)
> > * Aitozi (non-binding)
> > * Nicholas Jiang (non-binding)
> >
> > There are no disapproving votes.
> >
> > Thank you all for verifying the release candidate. I will now proceed to
> > finalize the release and announce it once everything is published.
> >
> > Cheers,
> > Gyula
> >
>


[jira] [Created] (FLINK-27027) Get rid of oddly named mvn-${sys mvn.forkNumber}.log files

2022-04-02 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27027:


 Summary: Get rid of oddly named mvn-${sys mvn.forkNumber}.log files
 Key: FLINK-27027
 URL: https://issues.apache.org/jira/browse/FLINK-27027
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System / CI
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0








[jira] [Created] (FLINK-27026) Upgrade checkstyle plugin

2022-04-02 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27026:


 Summary: Upgrade checkstyle plugin
 Key: FLINK-27026
 URL: https://issues.apache.org/jira/browse/FLINK-27026
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


Newer versions of the checkstyle plugin allow running checkstyle:check without 
requiring dependency resolution. This allows it to be used in a fresh 
environment.





[jira] [Created] (FLINK-27025) Cannot read parquet file, after putting the jar in the right place with right permissions

2022-04-02 Thread Ziheng Wang (Jira)
Ziheng Wang created FLINK-27025:
---

 Summary: Cannot read parquet file, after putting the jar in the 
right place with right permissions
 Key: FLINK-27025
 URL: https://issues.apache.org/jira/browse/FLINK-27025
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Table SQL / API
Affects Versions: 1.14.0
Reporter: Ziheng Wang


I am using Flink with the SQL API on AWS EMR. I can run queries on CSV files, 
no problem.

However, when I try to run queries on Parquet files, I get this error: Caused 
by: java.io.StreamCorruptedException: unexpected block data

I have put flink-sql-parquet_2.12-1.14.0.jar under /usr/lib/flink/lib on the 
master node of the EMR cluster. Flink does seem to pick it up, because if the 
jar is not there the error is different (it says it cannot understand the 
Parquet source). The jar has full 777 permissions and the same owner as all the 
other jars in that directory.

I tried passing a folder name as the Parquet source as well as a single Parquet 
file; nothing works.





Creating flink-kubernetes-operator project on Dockerhub

2022-04-02 Thread Gyula Fóra
Hi Devs,

Does anyone know what the process is for creating a new Dockerhub project
under apache?

I would like to create *apache/flink-kubernetes-operator* and get push
access to it.

Thank you!
Gyula


Re: [RESULT] [VOTE] Apache Flink Kubernetes Operator Release 0.1.0, release candidate #3

2022-04-02 Thread Márton Balassi
Thank you team, it was a pleasure to witness the community investing into
this much requested topic and coming to an initial release so swiftly.

On Sat, Apr 2, 2022 at 4:45 PM Gyula Fóra  wrote:

> I'm happy to announce that we have unanimously approved this release.
>
> There are 8 approving votes, 4 of which are binding:
> * Marton Balassi (binding)
> * Yang Wang (non-binding)
> * Gyula Fora (binding)
> * Biao Geng (non-binding)
> * Thomas Weise (binding)
> * Xintong Song (binding)
> * Aitozi (non-binding)
> * Nicholas Jiang (non-binding)
>
> There are no disapproving votes.
>
> Thank you all for verifying the release candidate. I will now proceed to
> finalize the release and announce it once everything is published.
>
> Cheers,
> Gyula
>


[RESULT] [VOTE] Apache Flink Kubernetes Operator Release 0.1.0, release candidate #3

2022-04-02 Thread Gyula Fóra
I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 4 of which are binding:
* Marton Balassi (binding)
* Yang Wang (non-binding)
* Gyula Fora (binding)
* Biao Geng (non-binding)
* Thomas Weise (binding)
* Xintong Song (binding)
* Aitozi (non-binding)
* Nicholas Jiang (non-binding)

There are no disapproving votes.

Thank you all for verifying the release candidate. I will now proceed to
finalize the release and announce it once everything is published.

Cheers,
Gyula


[jira] [Created] (FLINK-27024) Cleanup surefire configuration

2022-04-02 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27024:


 Summary: Cleanup surefire configuration
 Key: FLINK-27024
 URL: https://issues.apache.org/jira/browse/FLINK-27024
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


We have a few redundant surefire configurations in some connector modules, and 
overall a lot of duplication; several settings defined on the argLine could 
instead be systemEnvironmentVariables, which are easier to extend in sub-modules.





Re: [DISCUSS] Make Kubernetes Operator config "dynamic" and consider merging with flinkConfiguration

2022-04-02 Thread Gyula Fóra
Hi all,

Based on the feedback I have opened a ticket to start merging the config
settings on the operator side:
https://issues.apache.org/jira/browse/FLINK-27023

This won't allow dynamic configuration yet but it will simplify the
operator configuration as a starting point.

As a next step we can consider what config options to make dynamic and
whether to rename flinkConfiguration to configuration in the CRD.
At this point I feel that the flinkConfiguration name is actually not bad
given we also have logConfiguration.

Cheers,
Gyula

On Sat, Apr 2, 2022 at 10:00 AM Nicholas Jiang 
wrote:

> Thanks Gyula for discussing this topic! I also prefer Proposal 2, which
> merges *flinkConfiguration* and *operatorConfiguration* for easier
> understanding by end Flink users. IMO, from an end-user perspective, both
> *flinkConfiguration* and *operatorConfiguration* are configuration related
> to the Flink deployment or job, so there is no need to distinguish them and
> make users configure them separately.
>
> Best,
> Nicholas Jiang
>
> On 2022/04/01 18:25:14 Gyula Fóra wrote:
> > Hi Devs!
> >
> > *Background*:
> > With more and more features and options added to the flink kubernetes
> > operator it would make sense to not expose everything as first class
> > options in the deployment/jobspec (same as we do for flink configuration
> > currently).
> >
> > Furthermore it would be beneficial if users could control reconciliation
> > specific settings like timeouts, reschedule delays etc. on a per
> > deployment basis.
> >
> >
> > *Proposal 1*
> > The more conservative proposal would be to add a new
> > *operatorConfiguration* field to the deployment spec that the operator
> > would use during the controller loop (merged with the default operator
> > config). This makes the operator very extensible with new options and would
> > also allow overrides to the default operator config on a per deployment
> > basis.
> >
> >
> > *Proposal 2*
> > I would actually go one step further and propose that we should
> > merge *flinkConfiguration* and *operatorConfiguration*, as whether
> > something affects the flink job submission/job or the operator behaviour
> > does not really make a difference to the end user. For users the operator
> > is part of Flink, so having multiple configuration maps could simply cause
> > confusion.
> > We could simply prefix all operator related configs with
> > `kubernetes.operator` to ensure that we do not accidentally conflict with
> > flink native config options.
> > If we go this route I would even go as far as to naming it simply
> > *configuration* for sake of simplicity.
> >
> > I personally would go with proposal 2 to make this as simple as possible
> > for the users.
> >
> > Please let me know what you think!
> > Gyula
> >
>
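If the two maps are merged as in Proposal 2, the operator can still recover its own settings by filtering on the agreed prefix. A minimal sketch of that partitioning (function name and keys here are hypothetical):

```python
OPERATOR_PREFIX = "kubernetes.operator."

def split_configuration(merged):
    # Partition a merged config map into (operator, flink) settings by prefix.
    operator, flink = {}, {}
    for key, value in merged.items():
        (operator if key.startswith(OPERATOR_PREFIX) else flink)[key] = value
    return operator, flink
```

Because only the prefix is inspected, new operator options need no spec changes, which is the extensibility argument made above.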


[jira] [Created] (FLINK-27023) Merge default flink and operator configuration settings for the operator

2022-04-02 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27023:
--

 Summary: Merge default flink and operator configuration settings 
for the operator
 Key: FLINK-27023
 URL: https://issues.apache.org/jira/browse/FLINK-27023
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.0.0


Based on the mailing list discussion: 
[https://lists.apache.org/thread/pnf2gk9dgqv3qrtszqbfcdxf32t2gr3x]

As a first step we can combine the operator's default Flink and operator config.

This includes the following changes:
 # Get rid of the DefaultConfig class and replace it with a single Configuration 
object containing the settings for both.
 # Rename OperatorConfigOptions -> KubernetesOperatorConfigOptions
 # Prefix all options with `kubernetes` to get kubernetes.operator.
 # In the helm chart, combine the operatorConfiguration and 
flinkDefaultConfiguration into a common defaultConfigurationSection. We should 
still keep the logging settings separate for the two somehow.





[jira] [Created] (FLINK-27022) Expose taskSlots in TaskManager spec

2022-04-02 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27022:
--

 Summary: Expose taskSlots in TaskManager spec
 Key: FLINK-27022
 URL: https://issues.apache.org/jira/browse/FLINK-27022
 Project: Flink
  Issue Type: Sub-task
  Components: Kubernetes Operator
Reporter: Gyula Fora


Basically every Flink job needs to configure the number of task slots, and it 
is an important part of the resource spec.

We should include this in the TaskManager spec as an integer parameter.
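A sketch of what the proposed field might look like in a FlinkDeployment manifest (field name, placement and value are illustrative only; the final CRD shape is up to this ticket):

```yaml
spec:
  taskManager:
    # proposed integer parameter; illustrative name and value
    taskSlots: 4
```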





[jira] [Created] (FLINK-27021) CRD improvements for v1alpha2

2022-04-02 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27021:
--

 Summary: CRD improvements for v1alpha2
 Key: FLINK-27021
 URL: https://issues.apache.org/jira/browse/FLINK-27021
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.0.0


Umbrella Jira to track CRD improvements and changes for the next release





[jira] [Created] (FLINK-27020) use hive dialect in SqlClient would throw an error based on 1.15 version

2022-04-02 Thread Jing Zhang (Jira)
Jing Zhang created FLINK-27020:
--

 Summary: use hive dialect in SqlClient would throw an error based 
on 1.15 version
 Key: FLINK-27020
 URL: https://issues.apache.org/jira/browse/FLINK-27020
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.15.0
Reporter: Jing Zhang
 Attachments: image-2022-04-02-20-28-01-335.png

I use 1.15 rc0 and encountered a problem.
An error is thrown if I use the hive dialect in SqlClient.
 !image-2022-04-02-20-28-01-335.png! 
I have already added flink-sql-connector-hive-2.3.6_2.12-1.15-SNAPSHOT.jar.
I note that loading and using the hive module works fine, but using the hive 
dialect fails.





[jira] [Created] (FLINK-27019) use hive dialect in SqlClient would throw an error based on 1.15 version

2022-04-02 Thread Jing Zhang (Jira)
Jing Zhang created FLINK-27019:
--

 Summary: use hive dialect in SqlClient would throw an error based 
on 1.15 version
 Key: FLINK-27019
 URL: https://issues.apache.org/jira/browse/FLINK-27019
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.15.0
Reporter: Jing Zhang
 Attachments: image-2022-04-02-20-25-25-169.png

I use 1.15 rc0 and encountered a problem.
An error is thrown if I use the hive dialect in SqlClient.
 !image-2022-04-02-20-25-25-169.png! 

I have already added flink-sql-connector-hive-2.3.6_2.12-1.15-SNAPSHOT.jar.
I note that loading and using the hive module works fine, but using the hive 
dialect fails.





[jira] [Created] (FLINK-27018) timestamp missing trailing zero when outputting to kafka

2022-04-02 Thread jeff-zou (Jira)
jeff-zou created FLINK-27018:


 Summary: timestamp missing trailing zero when outputting to kafka
 Key: FLINK-27018
 URL: https://issues.apache.org/jira/browse/FLINK-27018
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.5
Reporter: jeff-zou


The bug is described as follows:

 
{code:java}
data in source:
 2022-04-02 03:34:21.260
but after sink by sql, data in kafka:
 2022-04-02 03:34:21.26
{code}
 

The data is missing its trailing zero in Kafka.

 

sql:
{code:java}
create table kafka_table (stime timestamp(3)) with ('connector'='kafka','format' = 
'json');
insert into kafka_table select stime from (values(timestamp '2022-04-02 
03:34:21.260')){code}
The value in Kafka is \{"stime":"2022-04-02 03:34:21.26"}; the trailing zero is missing.
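The symptom is consistent with the fractional seconds being rendered as a plain number (which trims trailing zeros) instead of fixed-precision text. A Python illustration of the difference; this is not the actual Flink/JSON serialization code path:

```python
from datetime import datetime

ts = datetime.strptime("2022-04-02 03:34:21.260", "%Y-%m-%d %H:%M:%S.%f")

# Rendering the seconds as a plain number drops the trailing zero:
lossy = f"{ts.second + ts.microsecond / 1_000_000:g}"    # "21.26"

# Formatting with fixed millisecond precision preserves it:
exact = f"{ts.second:02d}.{ts.microsecond // 1000:03d}"  # "21.260"
```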





[jira] [Created] (FLINK-27017) Hive dialect supports divide by zero

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27017:


 Summary: Hive dialect supports divide by zero
 Key: FLINK-27017
 URL: https://issues.apache.org/jira/browse/FLINK-27017
 Project: Flink
  Issue Type: Sub-task
Reporter: luoyuxia








[jira] [Created] (FLINK-27016) Improve supporting for complex data type for Hive dialect

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27016:


 Summary: Improve supporting for complex data type for Hive dialect
 Key: FLINK-27016
 URL: https://issues.apache.org/jira/browse/FLINK-27016
 Project: Flink
  Issue Type: Sub-task
Reporter: luoyuxia








[jira] [Created] (FLINK-27015) Hive dialect supports cast timestamp to decimal implicitly while inserting

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27015:


 Summary: Hive dialect supports cast timestamp to decimal 
implicitly while inserting
 Key: FLINK-27015
 URL: https://issues.apache.org/jira/browse/FLINK-27015
 Project: Flink
  Issue Type: Bug
Reporter: luoyuxia








[jira] [Created] (FLINK-27014) Hive dialect support select null literal in subquery

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27014:


 Summary: Hive dialect support select null literal in subquery
 Key: FLINK-27014
 URL: https://issues.apache.org/jira/browse/FLINK-27014
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Affects Versions: 1.16.0
Reporter: luoyuxia


It throws the exception "Unsupported type 'NULL' to get internal serializer" 
for the following SQL in Hive dialect:

 
{code:java}
select count(key) from (select null as key from src)src {code}
The reason is that the null literal is considered to be of NULL type, but there 
is no serializer for NULL type. When meeting such a case, we could consider it 
as VARCHAR type.
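The suggested fix can be illustrated as a fallback in serializer lookup; the names below are made up for illustration and do not reflect Flink's actual type-inference code:

```python
SERIALIZERS = {
    "VARCHAR": "StringSerializer",
    "INT": "IntSerializer",
}

def serializer_for(logical_type):
    # Fall back to VARCHAR when the inferred type of a literal is NULL,
    # since no serializer exists for the NULL type itself.
    if logical_type == "NULL":
        logical_type = "VARCHAR"
    if logical_type not in SERIALIZERS:
        raise ValueError(f"Unsupported type '{logical_type}' to get internal serializer")
    return SERIALIZERS[logical_type]
```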

 

 





[jira] [Created] (FLINK-27013) Hive dialect supports IS_DISTINCT_FROM

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27013:


 Summary: Hive dialect supports IS_DISTINCT_FROM
 Key: FLINK-27013
 URL: https://issues.apache.org/jira/browse/FLINK-27013
 Project: Flink
  Issue Type: Improvement
Reporter: luoyuxia


It throws an exception with the error message "Unsupported call: IS DISTINCT 
FROM(STRING, STRING)" for the following SQL in Hive dialect:

 
{code:java}
create table test(x string, y string);
select x <=> y, (x <=> y) = false from test; {code}
 

 

I found that IS_NOT_DISTINCT_FROM is supported in ExprCodeGenerator.scala, but 
IS_DISTINCT_FROM is not. IS_DISTINCT_FROM should also be implemented in 
ExprCodeGenerator.
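For reference, the null-safe semantics the two operators encode (Hive's <=> is IS NOT DISTINCT FROM), sketched in Python:

```python
def is_distinct_from(a, b):
    # SQL IS DISTINCT FROM: like <>, but null-safe (never yields NULL)
    if a is None and b is None:
        return False  # two NULLs are not distinct
    if a is None or b is None:
        return True   # NULL is distinct from any non-NULL value
    return a != b

def is_not_distinct_from(a, b):
    # Hive's <=> operator: null-safe equality
    return not is_distinct_from(a, b)
```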

 





[jira] [Created] (FLINK-27012) Caching maven dependencies to speed up workflows for flink-kubernetes-operator

2022-04-02 Thread Yang Wang (Jira)
Yang Wang created FLINK-27012:
-

 Summary: Caching maven dependencies to speed up workflows for 
flink-kubernetes-operator
 Key: FLINK-27012
 URL: https://issues.apache.org/jira/browse/FLINK-27012
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Yang Wang


The current CI downloads the Maven dependencies every time. This could be 
avoided by using the GitHub Actions cache.

 
{code:java}
[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/apache/23/apache-23.pom
Progress (1): 2.7/18 kB
Progress (1): 5.5/18 kB
Progress (1): 8.2/18 kB
Progress (1): 11/18 kB
Progress (1): 14/18 kB
Progress (1): 16/18 kB
Progress (1): 18 kB
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/apache/23/apache-23.pom (18 kB at 55 kB/s)
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Flink Kubernetes: [pom]
[INFO] Flink Kubernetes Shaded [jar]
[INFO] Flink Kubernetes Operator [jar]
[INFO] Flink Kubernetes Webhook [jar]
[INFO] 
[INFO] -< org.apache.flink:flink-kubernetes-operator-parent >--
[INFO] Building Flink Kubernetes: 1.0-SNAPSHOT [1/4]
[INFO] [ pom ]-
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-remote-resources-plugin/1.7.0/maven-remote-resources-plugin-1.7.0.pom
Progress (1): 2.7/13 kB
Progress (1): 5.5/13 kB
Progress (1): 8.2/13 kB
Progress (1): 11/13 kB
Progress (1): 13 kB
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-remote-resources-plugin/1.7.0/maven-remote-resources-plugin-1.7.0.pom (13 kB at 990 kB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-plugins/33/maven-plugins-33.pom
Progress (1): 2.7/11 kB
Progress (1): 5.5/11 kB
Progress (1): 8.2/11 kB
Progress (1): 11 kB
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-plugins/33/maven-plugins-33.pom (11 kB at 763 kB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/33/maven-parent-33.pom
 {code}
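The standard actions/cache pattern for the local Maven repository would address this; the step below is a sketch (workflow context and key naming are illustrative):

```yaml
- name: Cache Maven dependencies
  uses: actions/cache@v3
  with:
    path: ~/.m2/repository
    key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      ${{ runner.os }}-maven-
```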





[jira] [Created] (FLINK-27011) Add various types of state usage examples

2022-04-02 Thread zhangjingcun (Jira)
zhangjingcun created FLINK-27011:


 Summary: Add various types of state usage examples
 Key: FLINK-27011
 URL: https://issues.apache.org/jira/browse/FLINK-27011
 Project: Flink
  Issue Type: New Feature
  Components: Examples
Affects Versions: 1.14.4
Reporter: zhangjingcun


Add various types of state usage examples





[jira] [Created] (FLINK-27010) Support setting sql client args via flink conf

2022-04-02 Thread Luning Wang (Jira)
Luning Wang created FLINK-27010:
---

 Summary: Support setting sql client args via flink conf
 Key: FLINK-27010
 URL: https://issues.apache.org/jira/browse/FLINK-27010
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Client
Affects Versions: 1.14.4
Reporter: Luning Wang


'-i', '-j' and '-l' can currently only be set via startup options.

I want to add the following options to flink-conf.yaml to set SQL Client 
options:
{code:java}
sql-client.execution.init-file: /foo/foo.sql
sql-client.execution.jar: foo.jar
sql-client.execution.library: /foo{code}

 

 





[jira] [Created] (FLINK-27009) Support SQL job submission in flink kubernetes operator

2022-04-02 Thread Biao Geng (Jira)
Biao Geng created FLINK-27009:
-

 Summary: Support SQL job submission in flink kubernetes operator
 Key: FLINK-27009
 URL: https://issues.apache.org/jira/browse/FLINK-27009
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Biao Geng


Currently, the flink kubernetes operator only supports jar jobs using 
application or session clusters. For SQL jobs, there is no out-of-the-box 
solution in the operator. One simple, short-term solution is to wrap the SQL 
script into a jar job using the Table API, with limitations.
The long-term solution may build on 
[FLINK-26541|https://issues.apache.org/jira/browse/FLINK-26541] to achieve full 
support.





[jira] [Created] (FLINK-27008) Document the configurable parameters of the helm chart and their default values

2022-04-02 Thread Yang Wang (Jira)
Yang Wang created FLINK-27008:
-

 Summary: Document the configurable parameters of the helm chart 
and their default values
 Key: FLINK-27008
 URL: https://issues.apache.org/jira/browse/FLINK-27008
 Project: Flink
  Issue Type: Sub-task
  Components: Kubernetes Operator
Reporter: Yang Wang


We might need a table to document all the configurable parameters of the helm 
chart and their default values. It could be put in the helm section [1].
||Parameter||Description||Default Value||
|watchNamespaces|List of Kubernetes namespaces to watch for FlinkDeployment changes; empty means all namespaces.| |
|image.repository|The image repository of flink-kubernetes-operator.|ghcr.io/apache/flink-kubernetes-operator|
|...|...| |

[1]. 
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/helm/





[jira] [Created] (FLINK-27007) Should the MailboxDefaultAction interface be annotated with @FunctionalInterface?

2022-04-02 Thread Shubin Ruan (Jira)
Shubin Ruan created FLINK-27007:
---

 Summary: Should the MailboxDefaultAction interface be annotated 
with @FunctionalInterface?
 Key: FLINK-27007
 URL: https://issues.apache.org/jira/browse/FLINK-27007
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: Shubin Ruan


When StreamTask initializes mailboxProcessor, it passes in the 
MailboxDefaultAction parameter through \{this::processInput}.
{code:java}
this.mailboxProcessor =
new MailboxProcessor(
this::processInput, mailbox, actionExecutor, 
numMailsProcessedCounter);{code}
Since the parameter can be passed as a lambda expression, MailboxDefaultAction 
is a functional interface, that is, it has only one unimplemented method. To 
increase code readability, should the MailboxDefaultAction interface be 
annotated with @FunctionalInterface?
{code:java}
@Internal
@FunctionalInterface
public interface MailboxDefaultAction {
   ...
} {code}





[jira] [Created] (FLINK-27006) Support SingleValueAggFunction for Char data type

2022-04-02 Thread luoyuxia (Jira)
luoyuxia created FLINK-27006:


 Summary: Support SingleValueAggFunction for Char data type
 Key: FLINK-27006
 URL: https://issues.apache.org/jira/browse/FLINK-27006
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: luoyuxia
 Fix For: 1.16.0


The exception happens when I try to run the following SQL with the Hive dialect:
{code:java}
create table tempty(c char(2));
select * from tempty where c = (select * from tempty) ;{code}
The exception is "SINGLE_VALUE aggregate function doesn't support type 'CHAR'."

CharType is missing from SingleValueAggFunction and should be supported.





Re: [DISCUSS] Make Kubernetes Operator config "dynamic" and consider merging with flinkConfiguration

2022-04-02 Thread Nicholas Jiang
Thanks Gyula for discussing this topic! I also prefer Proposal 2, which merges 
*flinkConfiguration* and *operatorConfiguration* for easier understanding by 
end Flink users. IMO, from an end-user perspective, both *flinkConfiguration* 
and *operatorConfiguration* are configuration related to the Flink deployment 
or job, so there is no need to distinguish them and make users configure them 
separately.

Best,
Nicholas Jiang

On 2022/04/01 18:25:14 Gyula Fóra wrote:
> Hi Devs!
> 
> *Background*:
> With more and more features and options added to the flink kubernetes
> operator it would make sense to not expose everything as first class
> options in the deployment/jobspec (same as we do for flink configuration
> currently).
> 
> Furthermore it would be beneficial if users could control reconciliation
> specific settings like timeouts, reschedule delays etc on a per deployment
> basis.
> 
> 
> *Proposal 1*
> The more conservative proposal would be to add a new
> *operatorConfiguration* field to the deployment spec that the operator
> would use during the controller loop (merged with the default operator
> config). This makes the operator very extensible with new options and would
> also allow overrides to the default operator config on a per deployment
> basis.
> 
> 
> *Proposal 2*
> I would actually go one step further and propose that we should
> merge *flinkConfiguration* and *operatorConfiguration*, as whether
> something affects the flink job submission/job or the operator behaviour
> does not really make a difference to the end user. For users the operator
> is part of Flink, so having multiple configuration maps could simply cause
> confusion.
> We could simply prefix all operator related configs with
> `kubernetes.operator` to ensure that we do not accidentally conflict with
> flink native config options.
> If we go this route I would even go as far as to naming it simply
> *configuration* for sake of simplicity.
> 
> I personally would go with proposal 2 to make this as simple as possible
> for the users.
> 
> Please let me know what you think!
> Gyula
> 


[jira] [Created] (FLINK-27005) Bump CRD version to v1alpha2

2022-04-02 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27005:
--

 Summary: Bump CRD version to v1alpha2
 Key: FLINK-27005
 URL: https://issues.apache.org/jira/browse/FLINK-27005
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora


We should upgrade the CRD version to v1alpha2 for both FlinkDeployment and 
FlinkSessionJob to avoid any conflicts with the preview release.

We should also update the version in all examples and documentation that 
reference it.

We can also consider introducing some tooling to make this easier, as we will 
have to repeat this step at least a few more times.
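The tooling could be as small as a script that rewrites the apiVersion across example manifests; a hedged sketch (paths, group name and versions are illustrative):

```python
from pathlib import Path

OLD = "flink.apache.org/v1alpha1"
NEW = "flink.apache.org/v1alpha2"

def bump_crd_version(root):
    # Replace OLD with NEW in every YAML file under root; return files changed.
    changed = 0
    for path in Path(root).rglob("*.yaml"):
        text = path.read_text()
        if OLD in text:
            path.write_text(text.replace(OLD, NEW))
            changed += 1
    return changed
```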





RE: Question about community collaboration options

2022-04-02 Thread Hao t Chang
Hi Martijn,

Thanks for the meeting pointer. I am interested in joining the next sync 
meeting on Tuesday.
I am curious: besides using the PR threads in GitHub, how do you "ping" people 
to get their attention?

On 4/1/22, 3:07 AM, "Martijn Visser"  wrote:

Hi Ted,

This is a great question. There are usually bi-weekly sync meetings to
discuss plans and progress for the next Flink release. For example, there
was a regular one for the Flink 1.15 release [1]

I do see some things that we could improve on as a Flink community. For
example, there are quite a large number of open PRs. PRs don't always get a
review or get merged. I'm hearing from other contributors that they need to
ping people in order to get attention for their PR. Some PRs are of poor
quality, because they don't adhere to the code contribution guide. I can
imagine that newly interested contributors don't know exactly where to
start.

I see other open source projects who indeed run a public Slack channel,
like Apache Airflow [2]. Kubernetes actually has a triage team and
processes documented [3].

I'm curious what other Users and Dev'ers think. What are some of the
problems that you're currently experiencing and how do you think we could
improve/solve those?

Best regards,

Martijn Visser
https://twitter.com/MartijnVisser82 
https://github.com/MartijnVisser 

[1] https://cwiki.apache.org/confluence/display/FLINK/1.15+Release 
[2] https://airflow.apache.org/community/  (Under "Ask a question")
[3]

https://github.com/kubernetes/community/blob/master/sig-contributor-experience/triage-team/triage.md
 

On Thu, 31 Mar 2022 at 23:44, Hao t Chang  wrote:

> Hi,
>
>
>
> I have been looking into Flink and joined the mailing lists recently. I am
> trying to figure out how the community members collaborate. For example, 
is
> there Slack channels or Weekly sync up calls where the community members
> can participate and talk with each other to brainstorm, design, and make
> decisions?
>
>
>
> Ted
>



[jira] [Created] (FLINK-27004) Use '.' instead of 'file..' in table properties of managed table

2022-04-02 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-27004:
---

 Summary: Use '.' instead of 
'file..' in table properties of managed table
 Key: FLINK-27004
 URL: https://issues.apache.org/jira/browse/FLINK-27004
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: 0.1.0
Reporter: Caizhi Weng
 Fix For: 0.1.0


Currently, if we want to set the compression method of the file store, we need 
to use 'file.orc.compress'. It would be better to use 'orc.compress' directly, 
just like we do for filesystem connectors.





[jira] [Created] (FLINK-27003) Operator Helm chart improvements

2022-04-02 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27003:
--

 Summary: Operator Helm chart improvements
 Key: FLINK-27003
 URL: https://issues.apache.org/jira/browse/FLINK-27003
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora


Umbrella ticket for helm related improvements for the next release





Re: [VOTE] Apache Flink Kubernetes Operator Release 0.1.0, release candidate #3

2022-04-02 Thread Nicholas Jiang
+1 (non-binding)

1. Verified maven and helm chart versions for the build from source
2. Verified helm chart points to correct docker image and deploys it by default
3. Verified helm installation and basic/checkpointing, stateful examples with 
upgrades and manual savepoints
4. Verified online documents including Quick Start etc.

Best,
Nicholas Jiang

On 2022/03/31 08:53:40 Yang Wang wrote:
> +1 (non-binding)
> 
> Verified via the following steps:
> 
> * Verify checksums and GPG signatures
> * Verify that the source distributions do not contain any binaries
> * Build source distribution successfully
> * Verify all the POM version is 0.1.0
> 
> * License check, the jars bundled in docker image and maven artifacts have
> correct NOTICE and licenses
> 
> # Functionality verification
> * Install flink-kubernetes-operator via helm
> - helm repo add flink-kubernetes-operator-0.1.0-rc3
> https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-0.1.0-rc3
> - helm install flink-kubernetes-operator
> flink-kubernetes-operator-0.1.0-rc3/flink-kubernetes-operator
> 
> * Apply a new FlinkDeployment CR with HA and ingress enabled, Flink webUI
> normal
> - kubectl apply -f
> https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-0.1/e2e-tests/data/cr.yaml
> 
> * Upgrade FlinkDeployment, new job parallelism takes effect and recover
> from latest checkpoint
> - kubectl patch flinkdep flink-example-statemachine --type merge
> --patch '{"spec":{"job": {"parallelism": 1 } } }'
> 
> * Verify manual savepoint trigger
> - kubectl patch flinkdep flink-example-statemachine --type merge
> --patch '{"spec":{"job": {"savepointTriggerNonce": 1 } } }'
> 
> * Suspend a FlinkDeployment
> - kubectl patch flinkdep flink-example-statemachine --type merge
> --patch '{"spec":{"job": {"state": "suspended" } } }'
> 
> 
> Best,
> Yang
> 
> Márton Balassi  wrote on Thu, Mar 31, 2022 at 01:01:
> 
> > +1 (binding)
> >
> > Verified the following:
> >
> >- shasums
> >- gpg signatures
> >- source does not contain any binaries
> >- built from source
> >- deployed via helm after adding the distribution webserver endpoint as
> >a helm registry
> >- all relevant files have license headers
> >
> >
> > On Wed, Mar 30, 2022 at 4:39 PM Gyula Fóra  wrote:
> >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #3 for the version 0.1.0
> > of
> > > Apache Flink Kubernetes Operator,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > **Release Overview**
> > >
> > > As an overview, the release consists of the following:
> > > a) Kubernetes Operator canonical source distribution (including the
> > > Dockerfile), to be deployed to the release repository at dist.apache.org
> > > b) Kubernetes Operator Helm Chart to be deployed to the release
> > repository
> > > at dist.apache.org
> > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > d) Docker image to be pushed to dockerhub
> > >
> > > **Staging Areas to Review**
> > >
> > > The staging areas containing the above mentioned artifacts are as
> > follows,
> > > for your review:
> > > * All artifacts for a,b) can be found in the corresponding dev repository
> > > at dist.apache.org [1]
> > > * All artifacts for c) can be found at the Apache Nexus Repository [2]
> > > * The docker image is staged on github [7]
> > >
> > > All artifacts are signed with the key
> > > 0B4A34ADDFFA2BB54EB720B221F06303B87DAFF1 [3]
> > >
> > > Other links for your review:
> > > * JIRA release notes [4]
> > > * source code tag "release-0.1.0-rc3" [5]
> > > * PR to update the website Downloads page to include Kubernetes Operator
> > > links [6]
> > >
> > > **Vote Duration**
> > >
> > > The voting time will run for at least 72 hours.
> > > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> > >
> > > **Note on Verification**
> > >
> > > You can follow the basic verification guide here:
> > > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
> > > Note that you don't need to verify everything yourself, but please make
> > > note of what you have tested together with your +- vote.
> > >
> > > Thanks,
> > > Gyula
> > >
> > > [1] https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-0.1.0-rc3/
> > > [2] https://repository.apache.org/content/repositories/orgapacheflink-1492/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351499
> > > [5] https://github.com/apache/flink-kubernetes-operator/tree/release-0.1.0-rc3
> > > [6] https://github.com/apache/flink-web/pull/519
> > > [7] ghcr.io/apache/flink-kubernetes-operator:2c166e3
> > >
> >
> 


Re: [DISCUSS] Make Kubernetes Operator config "dynamic" and consider merging with flinkConfiguration

2022-04-02 Thread Gyula Fóra
That's a very good point Matyas, we cannot risk any interference with other
jobs but I think we don't necessarily have to.

First of all we should only allow users to overwrite selected configs. For
deciding what to allow, we can separate the operator related configs into 2
main groups:

*Group 1*: Config options that are specific to the reconciliation logic of
a specific job such as feature flags etc (for example
https://issues.apache.org/jira/browse/FLINK-26926).
These configs cannot possibly cause interference, they are part of the
natural reconciliation logic.

*Group 2*: Config options that actually affect the controller scheduling and
memory/cpu requirements. These are the problematic ones, as they can actually
break the operator if we are not careful.

For Group 1 there are no safeguards necessary and I would say this is the
primary use-case I wanted to cover with this discussion.
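
For illustration, a per-deployment override of a Group 1 style option under
the merged-map proposal might look like this (the API version and the
operator key below are hypothetical, not a committed config name):

```yaml
# Illustrative sketch only — not an agreed spec.
apiVersion: flink.apache.org/v1alpha1
kind: FlinkDeployment
metadata:
  name: flink-example-statemachine
spec:
  flinkConfiguration:
    # regular Flink option, passed to the cluster/job as today
    taskmanager.numberOfTaskSlots: "2"
    # operator-scoped option in the same map, disambiguated by prefix
    kubernetes.operator.reconciliation.reschedule-delay: "60s"
```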

I think Group 2 could also be supported, as long as we specifically validate
the values, for example that scheduling delays are within pre-configured
bounds. One example would be configuring client timeouts: there could be
special cases where the operator's hardcoded timeout is not good enough, but
we also want to set a hard max bound on the configurable value.
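
A minimal, self-contained sketch of the kind of bounds check described above
(the class, option names, and bound values are made up for illustration —
this is not actual operator code):

```java
import java.time.Duration;
import java.util.Optional;

// Sketch: a user-supplied override for a "Group 2" option is only accepted
// when it stays within an operator-level hard bound; otherwise the operator
// default wins, so a bad override cannot break the controller loop.
public class BoundedConfigValidator {

    // Hard maximum set by the operator administrator (hypothetical value).
    static final Duration MAX_CLIENT_TIMEOUT = Duration.ofMinutes(5);
    static final Duration DEFAULT_CLIENT_TIMEOUT = Duration.ofSeconds(30);

    static Duration effectiveClientTimeout(Optional<Duration> userOverride) {
        return userOverride
                // reject negative or out-of-bound values
                .filter(t -> !t.isNegative() && t.compareTo(MAX_CLIENT_TIMEOUT) <= 0)
                .orElse(DEFAULT_CLIENT_TIMEOUT);
    }

    public static void main(String[] args) {
        // Within bounds: the override is accepted.
        System.out.println(effectiveClientTimeout(Optional.of(Duration.ofMinutes(2)))); // PT2M
        // Out of bounds: fall back to the operator default.
        System.out.println(effectiveClientTimeout(Optional.of(Duration.ofHours(1))));   // PT30S
    }
}
```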

Cheers,
Gyula



On Sat, Apr 2, 2022 at 8:57 AM Őrhidi Mátyás 
wrote:

> Thanks Gyula for bringing this topic up! Although the suggestion would
> indeed simplify the configuration handling I have some concerns about
> opening the operator configuration for end users in certain cases. In a
> multitenant scenario for example, how could we protect against one user
> messing up the configs and potentially disrupting others? As I see it, the
> operator acts as the control plane, ideally totally transparent for end
> users, often behind a rest API. Let me know what you think.
>
> Cheers,
> Matyas
>
> On Sat, Apr 2, 2022 at 5:12 AM Yang Wang  wrote:
>
> > I also like proposal 2. Maybe it could be named
> > *KubernetesOperatorConfigOptions*, which would look just like all the other
> > ConfigOptions (e.g. *KubernetesConfigOptions*, *YarnConfigOptions*) in Flink.
> > Proposal 2 is more natural and easier to use for Flink users.
> >
> >
> > Best,
> > Yang
> >
> > Gyula Fóra  wrote on Sat, Apr 2, 2022 at 02:25:
> >
> >> Hi Devs!
> >>
> >> *Background*:
> >> With more and more features and options added to the flink kubernetes
> >> operator it would make sense to not expose everything as first class
> >> options in the deployment/jobspec (same as we do for flink configuration
> >> currently).
> >>
> >> Furthermore it would be beneficial if users could control reconciliation
> >> specific settings like timeouts, reschedule delays etc on a per
> deployment
> >> basis.
> >>
> >>
> >> *Proposal 1*
> >> The more conservative proposal would be to add a new
> >> *operatorConfiguration* field to the deployment spec that the operator
> >> would use during the controller loop (merged with the default operator
> >> config). This makes the operator very extensible with new options and
> >> would
> >> also allow overrides to the default operator config on a per deployment
> >> basis.
> >>
> >>
> >> *Proposal 2*
> >> I would actually go one step further and propose that we should
> >> merge *flinkConfiguration* and *operatorConfiguration*, as whether
> >> something affects the flink job submission/job or the operator behaviour
> >> does not really make a difference to the end user. For users the operator
> >> is part of Flink, so having multiple configuration maps could simply cause
> >> confusion.
> >> We could simply prefix all operator related configs with
> >> `kubernetes.operator` to ensure that we do not accidentally conflict
> with
> >> flink native config options.
> >> If we go this route I would even go as far as naming it simply
> >> *configuration* for the sake of simplicity.
> >>
> >> I personally would go with proposal 2 to make this as simple as possible
> >> for the users.
> >>
> >> Please let me know what you think!
> >> Gyula
> >>
> >
>


Re: [DISCUSS] Make Kubernetes Operator config "dynamic" and consider merging with flinkConfiguration

2022-04-02 Thread Őrhidi Mátyás
Thanks Gyula for bringing this topic up! Although the suggestion would
indeed simplify the configuration handling I have some concerns about
opening the operator configuration for end users in certain cases. In a
multitenant scenario for example, how could we protect against one user
messing up the configs and potentially disrupting others? As I see it, the
operator acts as the control plane, ideally totally transparent for end
users, often behind a rest API. Let me know what you think.

Cheers,
Matyas

On Sat, Apr 2, 2022 at 5:12 AM Yang Wang  wrote:

> I also like proposal 2. Maybe it could be named
> *KubernetesOperatorConfigOptions*, which would look just like all the other
> ConfigOptions (e.g. *KubernetesConfigOptions*, *YarnConfigOptions*) in Flink.
> Proposal 2 is more natural and easier to use for Flink users.
>
>
> Best,
> Yang
>
> Gyula Fóra  wrote on Sat, Apr 2, 2022 at 02:25:
>
>> Hi Devs!
>>
>> *Background*:
>> With more and more features and options added to the flink kubernetes
>> operator it would make sense to not expose everything as first class
>> options in the deployment/jobspec (same as we do for flink configuration
>> currently).
>>
>> Furthermore it would be beneficial if users could control reconciliation
>> specific settings like timeouts, reschedule delays etc on a per deployment
>> basis.
>>
>>
>> *Proposal 1*
>> The more conservative proposal would be to add a new
>> *operatorConfiguration* field to the deployment spec that the operator
>> would use during the controller loop (merged with the default operator
>> config). This makes the operator very extensible with new options and
>> would
>> also allow overrides to the default operator config on a per deployment
>> basis.
>>
>>
>> *Proposal 2*
>> I would actually go one step further and propose that we should
>> merge *flinkConfiguration* and *operatorConfiguration*, as whether
>> something affects the flink job submission/job or the operator behaviour
>> does not really make a difference to the end user. For users the operator
>> is part of Flink, so having multiple configuration maps could simply cause
>> confusion.
>> We could simply prefix all operator related configs with
>> `kubernetes.operator` to ensure that we do not accidentally conflict with
>> flink native config options.
>> If we go this route I would even go as far as naming it simply
>> *configuration* for the sake of simplicity.
>>
>> I personally would go with proposal 2 to make this as simple as possible
>> for the users.
>>
>> Please let me know what you think!
>> Gyula
>>
>


[jira] [Created] (FLINK-27002) Optimize batch multiple partitions inserting

2022-04-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-27002:


 Summary: Optimize batch multiple partitions inserting
 Key: FLINK-27002
 URL: https://issues.apache.org/jira/browse/FLINK-27002
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.1.0


We can implement `SupportsPartitioning.requiresPartitionGrouping` so that the 
table store is written after the planner has ordered the records by partition, 
avoiding OOM caused by writing too many partitions at the same time.
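
A toy, self-contained illustration of why partition grouping helps (this is
not table-store code, just a model of the writer lifecycle): with input
grouped by partition the sink holds at most one open writer at a time, while
interleaved input forces a writer per partition to stay open.

```java
import java.util.*;

public class PartitionGroupingDemo {

    // A writer for partition p must stay open from the first to the last
    // record of p; return the peak number of simultaneously open writers.
    static int maxOpenWriters(List<String> records) {
        Map<String, Integer> last = new HashMap<>();
        for (int i = 0; i < records.size(); i++) last.put(records.get(i), i);
        Set<String> open = new HashSet<>();
        int peak = 0;
        for (int i = 0; i < records.size(); i++) {
            String p = records.get(i);
            open.add(p);
            peak = Math.max(peak, open.size());
            if (last.get(p) == i) open.remove(p); // run of p is finished
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> interleaved = Arrays.asList("a", "b", "c", "a", "b", "c");
        List<String> grouped = new ArrayList<>(interleaved);
        Collections.sort(grouped); // planner orders records by partition
        System.out.println(maxOpenWriters(interleaved)); // 3
        System.out.println(maxOpenWriters(grouped));     // 1
    }
}
```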



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [GitHub] [flink] flinkbot edited a comment on pull request #15519: [FLINK-22120] Remove duplicate code in generated code for map get

2022-04-02 Thread 范 佳兴
Unsubscribe

From: GitBox 
Sent: April 9, 2021 16:28
To: iss...@flink.apache.org 
Subject: [GitHub] [flink] flinkbot edited a comment on pull request #15519: 
[FLINK-22120] Remove duplicate code in generated code for map get


flinkbot edited a comment on pull request #15519:
URL: https://github.com/apache/flink/pull/15519#issuecomment-815409179


   
   ## CI report:

   * f4c660ad4e60a753513f4759dee39363fa86ebeb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=16292)

   
   Bot commands
 The @flinkbot bot supports the following commands:

- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org