[jira] [Commented] (FLINK-25234) Flink should parse ISO timestamp in UTC format

2024-05-09 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845049#comment-17845049
 ] 

Eric Xiao commented on FLINK-25234:
---

We just hit similar errors when trying to consume JSON data with our timestamp 
format set to `json.timestamp-format.standard=ISO-8601` while the timestamp 
data was in SQL format (and vice-versa). We have found this to happen on many 
occasions where there is a mismatch between `json.timestamp-format.standard` 
and the timestamp data in the JSON, which has led to friction and confusion 
for our users about what is actually wrong, since the error message does not 
make it clear.
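
To illustrate the failure mode with plain java.time (a minimal sketch; the 
SQL-style pattern below is an approximation of the connector's behavior, not 
the exact Flink internals):
{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TimestampMismatchDemo {
    public static void main(String[] args) {
        // SQL-style pattern: space separator, no zone marker.
        DateTimeFormatter sqlStyle = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
        String isoValue = "2021-12-08T12:59:57.028Z"; // ISO-8601 payload

        try {
            LocalDateTime.parse(isoValue, sqlStyle);
        } catch (DateTimeParseException e) {
            // Fails on the 'T' separator and trailing 'Z', surfacing the same
            // kind of confusing DateTimeParseException users see.
            System.out.println(e.getMessage());
        }
    }
}
{code}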

Is there a way we can automate the selection of timestamp formats? That is, 
would it make sense/be a good feature to add a way to tell Flink to try all 
supported formats when parsing a timestamp, e.g. 
`json.timestamp-format.parse-style=\{strict, lenient}`, where strict would be 
the default to preserve backwards compatibility?
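
For illustration, a minimal sketch of what a lenient mode could do internally: 
try each supported format in order and return the first match. The option name 
above and the candidate list below are assumptions, not existing Flink API:
{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

public class LenientTimestampParsing {
    // Hypothetical candidates; a real implementation would reuse the
    // formatters Flink already defines for the ISO-8601 and SQL standards.
    private static final List<DateTimeFormatter> CANDIDATES = List.of(
            DateTimeFormatter.ISO_OFFSET_DATE_TIME,  // 2021-12-08T12:59:57.028Z
            DateTimeFormatter.ISO_LOCAL_DATE_TIME,   // 2021-12-08T12:59:57.028
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss[.SSS]"));

    static LocalDateTime parseLenient(String text) {
        for (DateTimeFormatter format : CANDIDATES) {
            try {
                return LocalDateTime.parse(text, format);
            } catch (DateTimeParseException ignored) {
                // Fall through and try the next candidate.
            }
        }
        throw new DateTimeParseException("No supported timestamp format matched", text, 0);
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("2021-12-08T12:59:57.028Z"));
        System.out.println(parseLenient("2021-12-08 12:59:57.028"));
    }
}
{code}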

Would love some feedback from the community / [~twalthr]; I think we can 
definitely reduce some friction here for users.

> Flink should parse ISO timestamp in UTC format
> --
>
> Key: FLINK-25234
> URL: https://issues.apache.org/jira/browse/FLINK-25234
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Affects Versions: 1.14.0
>Reporter: Egor Ryashin
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Error parsing timestamp with ISO-8601 format:
> {code:java}
> [ERROR] Could not execute SQL statement. Reason:
> java.time.format.DateTimeParseException: Text '2021-12-08T12:59:57.028Z' 
> could not be parsed, unparsed text found at index 23 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31941) Not backwards compatible naming for some kubernetes resources

2023-04-25 Thread Eric Xiao (Jira)
Eric Xiao created FLINK-31941:
-

 Summary: Not backwards compatible naming for some kubernetes 
resources
 Key: FLINK-31941
 URL: https://issues.apache.org/jira/browse/FLINK-31941
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Reporter: Eric Xiao


We are in the process of migrating all our workloads over to the Kubernetes 
operator and noticed that some of the Kubernetes resource names in the operator 
are hardcoded and not consistent with how Flink previously defined them. This 
is causing some downstream incompatibilities for us and some migration toil in 
our monitoring and dashboards, which have queries that depend on the previous 
naming schema.

I couldn't find exact definitions of a task-manager or job-manager in the Flink 
repo, but this is what I have noticed; I may be wrong in my interpretations.
h3. Deployment Names

Previously:
{code:java}
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
trickle-job-manager2/2 22   13d
trickle-task-manager   10/10   10   10  13d {code}
New (Flink Operator):
{code:java}
NAME  READY   UP-TO-DATE   AVAILABLE   AGE
trickle   2/2 22   6h25m
trickle-taskmanager   4/4 44   6h25m {code}

[1] 
[https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-standalone/src/main/java/org/apache/flink/kubernetes/operator/utils/StandaloneKubernetesUtils.java#L29-L38]
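
For reference, a rough sketch of the hardcoded naming in [1]; the method and 
constant names below are approximations, not exact copies of the operator code:
{code:java}
public final class NamingSketch {
    // The operator hardcodes a "-taskmanager" suffix and reuses the cluster id
    // as-is for the JobManager deployment, which is why "trickle-job-manager"
    // became "trickle" and "trickle-task-manager" became "trickle-taskmanager".
    private static final String TASK_MANAGER_SUFFIX = "-taskmanager";

    public static String jobManagerDeploymentName(String clusterId) {
        return clusterId;
    }

    public static String taskManagerDeploymentName(String clusterId) {
        return clusterId + TASK_MANAGER_SUFFIX;
    }
}
{code}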
h3. Pod Names

Previously:
{code:java}
NAME                                    READY   STATUS      RESTARTS   AGE
trickle-job-manager-65d95d4854-lgmsm    1/1     Running     0          13d
trickle-job-manager-65d95d4854-vdzl8    1/1     Running     0          5d
trickle-task-manager-86c85cf647-46nxh   1/1     Running     0          5d
trickle-task-manager-86c85cf647-ct6c5   1/1     Running     0          5d
trickle-task-manager-86c85cf647-h894q   1/1     Running     0          5d
trickle-task-manager-86c85cf647-kpr5x   1/1     Running     0          5d{code}
New (Flink Operator):
{code:java}
NAME                                   READY   STATUS    RESTARTS   AGE
trickle-58f895675f-9m5wm               1/1     Running   0          25h
trickle-58f895675f-n4hhv               1/1     Running   0          25h
trickle-taskmanager-6f9f64b9b9-857lv   1/1     Running   0          25h
trickle-taskmanager-6f9f64b9b9-cnsrx   1/1     Running   0          25h{code}

The pod names stem from the deployment names, so a fix to update the deployment 
names may also fix the pod names.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Description: 
When turning on Flink reactive mode, the [elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration] 
suggest converting all {{setParallelism}} calls to {{setMaxParallelism}}.

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed; {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function, which is not exposed.

This means that for any sink in a Flink pipeline, we cannot set a max parallelism.
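
For illustration, a minimal sketch of the passthrough we would want on 
{{DataStreamSink}}, mirroring the existing {{setParallelism}}; the exact shape 
is an assumption (validation and annotations omitted):
{code:java}
// Hypothetical addition inside DataStreamSink<T>: delegate to the underlying
// Transformation, exactly as setParallelism already does.
public DataStreamSink<T> setMaxParallelism(int maxParallelism) {
    transformation.setMaxParallelism(maxParallelism);
    return this;
}
{code}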

  was:
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

 

This means for any sink in the Flink pipeline, we cannot set a max parallelism.


> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism}} from [elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.
>  
> This means for any sink in the Flink pipeline, we cannot set a max 
> parallelism.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Description: 
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

 

This means for any sink in the Flink pipeline, we cannot set a max parallelism.

  was:
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.


> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.
>  
> This means for any sink in the Flink pipeline, we cannot set a max 
> parallelism.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715028#comment-17715028
 ] 

Eric Xiao commented on FLINK-31873:
---

Thanks [~martijnvisser] and [~luoyuxia], I will start by opening a thread on 
the dev mailing list before making a FLIP :).

> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-20 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714745#comment-17714745
 ] 

Eric Xiao commented on FLINK-31873:
---

I have an open PR to address this issue: 
https://github.com/apache/flink/pull/22438

> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Blocker
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-20 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Attachment: Screenshot 2023-04-20 at 4.33.14 PM.png

> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Blocker
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-20 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Description: 
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

  was:
When turning on Flink reactive mode, it is suggested to convert all 
`setParallelism` calls to `setMaxParallelism`, [elastic 
scaling|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the `DataStreamSink`, only the 
[`setParallelism` 
function|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]
 of the [`Transformation` class is exposed - `Transformation` also has the 
`setMaxParallelism` function which is not 
exposed|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285].


> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Blocker
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-20 Thread Eric Xiao (Jira)
Eric Xiao created FLINK-31873:
-

 Summary: Add setMaxParallelism to the DataStreamSink Class
 Key: FLINK-31873
 URL: https://issues.apache.org/jira/browse/FLINK-31873
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: Eric Xiao


When turning on Flink reactive mode, it is suggested to convert all 
`setParallelism` calls to `setMaxParallelism`, [elastic 
scaling|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the `DataStreamSink`, only the 
[`setParallelism` 
function|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]
 of the [`Transformation` class is exposed - `Transformation` also has the 
`setMaxParallelism` function which is not 
exposed|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29073) [FLIP-91] Support SQL Gateway(Part 2)

2022-11-23 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637941#comment-17637941
 ] 

Eric Xiao commented on FLINK-29073:
---

Thanks [~yzl]! Yes, let's connect after the FLIP is accepted as well :).

Yes, my team and I are interested in working on #5. Is there a timeline the 
community has in mind for getting these issues closed?

I will take a look at your PRs to see what is needed to expose 
`completeStatement` through the REST API.

> [FLIP-91] Support SQL Gateway(Part 2)
> -
>
> Key: FLINK-29073
> URL: https://issues.apache.org/jira/browse/FLINK-29073
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Client, Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Priority: Major
>
> This issue continues improving the SQL Gateway and allows the SQL Client to 
> submit jobs to the SQL Gateway.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-21301) Decouple window aggregate allow lateness with state ttl configuration

2022-11-21 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636864#comment-17636864
 ] 

Eric Xiao edited comment on FLINK-21301 at 11/21/22 8:16 PM:
-

Hi there, I had a read of 
[https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was 
wondering whether there is any particular reason why the allow-lateness 
configuration is not enabled for Window TVF aggregations, given that Group 
Window Aggregation is deprecated [1].

Would it be hard to implement? We might have bandwidth on our team to work on 
it.

[1] - 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-agg/#group-window-aggregation
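
For context, a minimal sketch of the kind of Window TVF aggregation we would 
like allow-lateness support for (the table and column names are placeholders, 
and a {{tableEnv}} is assumed):
{code:java}
// Placeholder names: `my_source` with an event-time attribute `rowtime`.
tableEnv.sqlQuery(
    "SELECT window_start, window_end, COUNT(1) AS cnt "
        + "FROM TABLE(TUMBLE(TABLE my_source, DESCRIPTOR(rowtime), INTERVAL '1' DAY)) "
        + "GROUP BY window_start, window_end");
{code}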


was (Author: JIRAUSER295489):
Hi there, I had a read of 
[https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was 
wondering if there was any particular reason why the allow-lateness 
configuration is not enabled for Window TVF aggregations?

> Decouple window aggregate allow lateness with state ttl configuration
> -
>
> Key: FLINK-21301
> URL: https://issues.apache.org/jira/browse/FLINK-21301
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0
>
>
> Currently, the state retention time config also affects the state cleanup 
> behavior of Window Aggregate, which is unexpected for most users.
> E.g. in the following example, a user would set `MinIdleStateRetentionTime` to 
> 1 day to clean state in `deduplicate`. However, it also affects the cleanup 
> behavior of the window aggregate. For example, 2021-01-04 data would be 
> cleaned at 2021-01-06 instead of 2021-01-05.
> {code:sql}
> SELECT
>  DATE_FORMAT(tumble_end(ROWTIME ,interval '1' DAY),'yyyy-MM-dd') as stat_time,
>  count(1) first_phone_num
> FROM (
>  SELECT 
>  ROWTIME,
>  user_id,
>  row_number() over(partition by user_id, pdate order by ROWTIME ) as rn
>  FROM source_kafka_biz_shuidi_sdb_crm_call_record 
> ) cal 
> where rn =1
> group by tumble(ROWTIME,interval '1' DAY);{code}
> It's better to decouple window aggregate allow-lateness from 
> `MinIdleStateRetentionTime`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-21301) Decouple window aggregate allow lateness with state ttl configuration

2022-11-21 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636864#comment-17636864
 ] 

Eric Xiao commented on FLINK-21301:
---

Hi there, I had a read of 
[https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was 
wondering if there was any particular reason why the allow-lateness 
configuration is not enabled for Window TVF aggregations?

> Decouple window aggregate allow lateness with state ttl configuration
> -
>
> Key: FLINK-21301
> URL: https://issues.apache.org/jira/browse/FLINK-21301
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0
>
>
> Currently, the state retention time config also affects the state cleanup 
> behavior of Window Aggregate, which is unexpected for most users.
> E.g. in the following example, a user would set `MinIdleStateRetentionTime` to 
> 1 day to clean state in `deduplicate`. However, it also affects the cleanup 
> behavior of the window aggregate. For example, 2021-01-04 data would be 
> cleaned at 2021-01-06 instead of 2021-01-05.
> {code:sql}
> SELECT
>  DATE_FORMAT(tumble_end(ROWTIME ,interval '1' DAY),'yyyy-MM-dd') as stat_time,
>  count(1) first_phone_num
> FROM (
>  SELECT 
>  ROWTIME,
>  user_id,
>  row_number() over(partition by user_id, pdate order by ROWTIME ) as rn
>  FROM source_kafka_biz_shuidi_sdb_crm_call_record 
> ) cal 
> where rn =1
> group by tumble(ROWTIME,interval '1' DAY);{code}
> It's better to decouple window aggregate allow-lateness from 
> `MinIdleStateRetentionTime`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29073) [FLIP-91] Support SQL Gateway(Part 2)

2022-11-21 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636842#comment-17636842
 ] 

Eric Xiao commented on FLINK-29073:
---

Hi there, are any of the tasks above good for someone new to the community to 
start working on?

> [FLIP-91] Support SQL Gateway(Part 2)
> -
>
> Key: FLINK-29073
> URL: https://issues.apache.org/jira/browse/FLINK-29073
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Client, Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Priority: Major
>
> This issue continues improving the SQL Gateway and allows the SQL Client to 
> submit jobs to the SQL Gateway.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs

2022-11-03 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628369#comment-17628369
 ] 

Eric Xiao commented on FLINK-29837:
---

Hi [~luoyuxia], thanks for responding!

We are exploring the Flink SQL API and have been translating some of our 
DataStream API pipelines to Flink SQL. In our pipelines we have business logic 
that depends on the kind of changelog event a particular row is associated 
with.

I saw that there is a `ChangelogMode` parameter [1] that you can pass to 
`fromChangelogStream`, but I don't think that addresses our needs.
 [1] 
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/data_stream_api/#handling-of-changelog-streams
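
For context, here is a minimal Java sketch of the DataStream-side workaround we 
use today: convert the table back to a changelog stream and read the 
{{RowKind}} there, since the SQL layer cannot reference it. ({{tableEnv}} and 
{{table}} are assumed to be set up as in the issue description.)
{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.types.Row;
import org.apache.flink.types.RowKind;

// Convert the Table back into a changelog stream; each Row carries its kind.
DataStream<Row> changelog = tableEnv.toChangelogStream(table);

// Filter on the RowKind on the DataStream side, which SQL cannot do today.
changelog
        .filter(row -> row.getKind() == RowKind.INSERT)
        .print();
{code}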

> SQL API does not expose the RowKind of the Row for processing Changelogs
> 
>
> Key: FLINK-29837
> URL: https://issues.apache.org/jira/browse/FLINK-29837
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.16.0
>Reporter: Eric Xiao
>Priority: Major
>
> When working with {{ChangeLog}} data in the SQL API, it was a bit misleading 
> to see that the {{op}} column appears{^}[1]{^} in the table schema of the 
> printed results, but it is not available to be used in the SQL API:
> {code:java}
> val tableEnv = StreamTableEnvironment.create(env)
> val dataStream = env.fromElements(
>   Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
>   Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
>   Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
> )(Types.ROW(Types.STRING, Types.INT))
> // interpret the DataStream as a Table
> val table =
>   tableEnv.fromChangelogStream(dataStream, 
> Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert())
> // register the table under a name and perform an aggregation
> tableEnv.createTemporaryView("InputTable", table)
> tableEnv
>   .sqlQuery("SELECT * FROM InputTable where op = '+I'")
>   .execute()
>   .print() {code}
> The error logs:
> {code:java}
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. From line 1, column 32 to line 1, column 33: Column 
> 'op' not found in any table
>     at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184)
>     at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109)
>     at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
>     at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
>     at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675){code}
> It would be nice to expose the `op` column so it is usable in the Flink SQL 
> APIs, as it is in the DataStream APIs.
> [1] 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs

2022-11-03 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-29837:
--
Description: 
When working with {{ChangeLog}} data in the SQL API, it was a bit misleading to 
see that the {{op}} column appears{^}[1]{^} in the table schema of the printed 
results, but it is not available to be used in the SQL API:
{code:java}
val tableEnv = StreamTableEnvironment.create(env)

val dataStream = env.fromElements(
  Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
  Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
  Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
)(Types.ROW(Types.STRING, Types.INT))

// interpret the DataStream as a Table
val table =
  tableEnv.fromChangelogStream(dataStream, 
Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert())

// register the table under a name and perform an aggregation
tableEnv.createTemporaryView("InputTable", table)
tableEnv
  .sqlQuery("SELECT * FROM InputTable where op = '+I'")
  .execute()
  .print() {code}
The error logs:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. From line 1, column 32 to line 1, column 33: Column 'op' not 
found in any table
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109)
    at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
    at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675){code}
It would be nice to expose the `op` column so it is usable in the Flink SQL 
APIs, as it is in the DataStream APIs.

[1] 
[https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream]
 

  was:
When working with `{{{}ChangeLog{}}}` data in the SQL API it was a bit 
misleading to see that the `{{{}op{}}}` column appears{^}[1]{^}  the type of  
in the table schema of print results but it is not available to be used in a 
the SQL API:
{code:java}
val tableEnv = StreamTableEnvironment.create(env)

val dataStream = env.fromElements(
  Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
  Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
  Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
)(Types.ROW(Types.STRING, Types.INT))

// interpret the DataStream as a Table
val table =
  tableEnv.fromChangelogStream(dataStream, 
Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert())

// register the table under a name and perform an aggregation
tableEnv.createTemporaryView("InputTable", table)
tableEnv
  .sqlQuery("SELECT * FROM InputTable where op = '+I'")
  .execute()
  .print() {code}
The error logs.

 

 
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. From line 1, column 32 to line 1, column 33: Column 'op' not 
found in any table
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109)
    at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
    at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$.delayedEndpoint$com$shopify$trickle$pipelines$IteratorPipeline$1(IteratorPipeline.scala:32)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$delayedInit$body.apply(IteratorPipeline.scala:11)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$.main(IteratorPipeline.scala:11)
    at 
com.shopify.trickle.pipelines.IteratorPipeline.main(IteratorPipeline.scala) 
{code}
It would be nice to expose the `op` column to be usable in the Flink SQL APIs 
as it is in the DataStream APIs.

[1] 
[https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream]
 


> SQL API does not expose the RowKind of the Row for processing Changelogs
> 

[jira] [Created] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs

2022-11-01 Thread Eric Xiao (Jira)
Eric Xiao created FLINK-29837:
-

 Summary: SQL API does not expose the RowKind of the Row for 
processing Changelogs
 Key: FLINK-29837
 URL: https://issues.apache.org/jira/browse/FLINK-29837
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.16.0
Reporter: Eric Xiao


When working with {{ChangeLog}} data in the SQL API, it was a bit misleading to 
see that the {{op}} column appears{^}[1]{^} in the table schema of the printed 
results, but it is not available to be used in the SQL API:
{code:java}
val tableEnv = StreamTableEnvironment.create(env)

val dataStream = env.fromElements(
  Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
  Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
  Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
)(Types.ROW(Types.STRING, Types.INT))

// interpret the DataStream as a Table
val table =
  tableEnv.fromChangelogStream(dataStream, 
Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert())

// register the table under a name and perform an aggregation
tableEnv.createTemporaryView("InputTable", table)
tableEnv
  .sqlQuery("SELECT * FROM InputTable where op = '+I'")
  .execute()
  .print() {code}
The error logs:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. From line 1, column 32 to line 1, column 33: Column 'op' not 
found in any table
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109)
    at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
    at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$.delayedEndpoint$com$shopify$trickle$pipelines$IteratorPipeline$1(IteratorPipeline.scala:32)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$delayedInit$body.apply(IteratorPipeline.scala:11)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at 
com.shopify.trickle.pipelines.IteratorPipeline$.main(IteratorPipeline.scala:11)
    at 
com.shopify.trickle.pipelines.IteratorPipeline.main(IteratorPipeline.scala) 
{code}
It would be nice to expose the `op` column so it is usable in the Flink SQL 
APIs, as it is in the DataStream APIs.

[1] 
[https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream]
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 7:24 PM:
-

Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task. My team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? If not, I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # Set a default data type when the empty array is created (see below for 
examples from other databases). Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type, e.g. Integer.
 # Teach operations such as `COALESCE` how to coerce the default data type to 
the desired data type (a separate PR).

I think Step 1 is sufficient to unblock users in creating an empty array for 
the time being, but Step 2 is required for a seamless SQL experience. Without 
Step 2, users will most likely have to manually convert their empty array's 
data type.

With Step 1 (note the cast to an array type, assuming {{int_column}} is of type 
{{ARRAY<INT>}}):
{code:java}
SELECT COALESCE(int_column, CAST(ARRAY[] AS ARRAY<INT>)){code}
With Steps 1 and 2:
{code:java}
SELECT COALESCE(int_column, ARRAY[]) {code}
*Default Data Type*
I believe the query in the issue would qualify as a query with no context. I 
tested it in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

Trino uses an unknown datatype.

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

Spark also uses an unknown datatype.

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
BigQuery uses an integer datatype.


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # (Look below for examples from other databases) Set a default data type when 
the empty array is created: Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type i.e. Integer.
 # Teach operations such as `COALESCE` how to type coercion from the default 
data type to the desired data type (a separate PR).

Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 7:20 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # (Look below for examples from other databases) Set a default data type when 
the empty array is created: Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type i.e. Integer.
 # Teach operations such as `COALESCE` how to type coercion from the default 
data type to the desired data type (a separate PR).

Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:43 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!!Screen Shot 
2022-10-25 at 10.50.42 PM.png|width=761,height=148!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:29 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!!Screen Shot 
2022-10-25 at 10.50.42 PM.png|width=761,height=148!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that. 

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

They use unknown datatype

BigQuery:

They use integer datatype

a similar query in Trino (Presto) and BigQuery and they use a data type Integer 
as the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:27 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we given more context on what the array type should be we should try 
using that.
2. If we have no context we use a default data type.

Path #1 - 
I can forsee queries as such `SELECT COALESCE(empty_str_column,ARRAY[])` where 
we could infer the data should be of string type and try to return that. 

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

They use unknown datatype

BigQuery:

They use integer datatype

a similar query in Trino (Presto) and BigQuery and they use a data type Integer 
as the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 



was (Author: JIRAUSER295489):
Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? If not, I would love to take over.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should 
try to use that.
2. If we have no context, we use a default data type.

Path #1 - Inferred Data Type
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` 
where we could infer that the data should be of string type and try to return 
that.

Path #2 - Default Data Type
I believe the query in this issue qualifies as a query with no context. I 
tested a similar query in Trino (Presto) and BigQuery, and they use Integer as 
the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-25 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 4:19 AM:
-

Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? If not, I would love to take over.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should 
try to use that.
2. If we have no context, we use a default data type.

Path #1 - Inferred Data Type
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` 
where we could infer that the data should be of string type and try to return 
that.

Path #2 - Default Data Type
I believe the query in this issue qualifies as a query with no context. I 
tested a similar query in Trino (Presto) and BigQuery, and they use Integer as 
the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 



was (Author: JIRAUSER295489):
Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should 
try to use that.
2. If we have no context, we use a default data type.

Path #1 - Inferred Data Type
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` 
where we could infer that the data should be of string type and try to return 
that.

Path #2 - Default Data Type
I believe the query in this issue qualifies as a query with no context. I 
tested a similar query in Trino (Presto) and BigQuery, and they use Integer as 
the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-25 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 3:53 AM:
-

Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should 
try to use that.
2. If we have no context, we use a default data type.

Path #1 - Inferred Data Type
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` 
where we could infer that the data should be of string type and try to return 
that.

Path #2 - Default Data Type
I believe the query in this issue qualifies as a query with no context. I 
tested a similar query in Trino (Presto) and BigQuery, and they use Integer as 
the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 



was (Author: JIRAUSER295489):
Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I tested a similar query in Trino (Presto) and BigQuery, and they use 
Integer as the data type by default. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-25 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 3:04 AM:
-

Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I tested a similar query in Trino (Presto) and BigQuery, and they use 
Integer as the data type by default. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 



was (Author: JIRAUSER295489):
Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I tested a similar query in Trino (Presto) and BigQuery, and they use 
Integer as the data type by default. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-25 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao commented on FLINK-20578:
---

Hi, I wanted to get more involved in contributing to the Flink project and 
found this starter task - my team is working with the Table / SQL APIs, so I 
thought this would be a good first task to work on :). [~surahman], are you 
still working on this issue? I noticed it has been a year since your last 
comment.

> If Flink supports empty arrays, which data type should the array elements 
> have? Does it cause new problems?
[~pensz] I tested a similar query in Trino (Presto) and BigQuery, and they use 
Integer as the data type by default. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-25 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-20578:
--
Attachment: Screen Shot 2022-10-25 at 10.50.42 PM.png
Screen Shot 2022-10-25 at 10.50.47 PM.png
Screen Shot 2022-10-25 at 11.01.06 PM.png

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-20 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621332#comment-17621332
 ] 

Eric Xiao commented on FLINK-29498:
---

[~gaoyunhaii], I believe I have a 
[PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state. 
What is the process for getting folks from the community to review it?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Assignee: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-20 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621332#comment-17621332
 ] 

Eric Xiao edited comment on FLINK-29498 at 10/20/22 7:30 PM:
-

Hi [~gaoyunhaii], I believe I have a 
[PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state. 
What is the process for getting folks from the community to review it?


was (Author: JIRAUSER295489):
[~gaoyunhaii], I believe I have a 
[PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state. 
What is the process for getting folks from the community to review it?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Assignee: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25595) Specify hash/sort aggregate strategy in SQL hint

2022-10-20 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621193#comment-17621193
 ] 

Eric Xiao commented on FLINK-25595:
---

Hi [~godfreyhe] and [~jingzhang], I would like to get more involved in the 
Flink community. How can someone new to the code base get familiar with this 
part of it? I see that there is still an open issue related to this one.

I am familiar (using this word lightly) with how query optimizations work, 
from using other systems like Spark and Trino.

> Specify hash/sort aggregate strategy in SQL hint
> 
>
> Key: FLINK-25595
> URL: https://issues.apache.org/jira/browse/FLINK-25595
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jing Zhang
>Assignee: ZhuoYu Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-16 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618378#comment-17618378
 ] 

Eric Xiao commented on FLINK-29498:
---

As discussed in the [Slack 
thread|https://apache-flink.slack.com/archives/C03G7LJTS2G/p1663957561957419], 
I am eager to contribute the Scala util classes :). Should we assign the Jira 
ticket to me (I'm not sure how to do that)?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-04 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612625#comment-17612625
 ] 

Eric Xiao edited comment on FLINK-29498 at 10/4/22 2:03 PM:


> For your case, you can try to implement your own `AsyncRetryStrategy` to 
> enable retries.

Thanks [~lincoln.86xy] for replying :). This is what we have done so far - 
thankfully it is not a lot of code - but we were wondering: is there a reason 
why the `AsyncRetryStrategies` are only available in the Java API and not the 
Scala API? Similarly for the `RetryPredicates`?

I am fairly new to Flink, but I believe I have seen some `.toJava` and 
`.toScala` helper methods on other Flink components and was wondering if there 
is room to add the same functionality to the retry strategies builder and 
retry predicates?
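
If it helps the discussion, below is a rough sketch of the `.toScala` idea. 
The `ScalaRetryStrategy` trait is a hypothetical stand-in for the Scala-side 
type (the real Scala `AsyncRetryStrategy` may have a different shape, 
especially around the retry predicate), and it assumes the Java interface 
exposes `canRetry` and `getBackoffTimeMillis` - so treat this as the shape of 
the idea, not the confirmed API:

{code:java}
import org.apache.flink.streaming.api.functions.async.{AsyncRetryStrategy => JRetryStrategy}

// Hypothetical stand-in for the Scala-side retry strategy trait; the real
// trait's members (in particular the retry predicate) may differ.
trait ScalaRetryStrategy[OUT] {
  def canRetry(currentAttempts: Int): Boolean
  def getBackoffTimeMillis(currentAttempts: Int): Long
}

object RetryStrategyConverters {
  // Enables `javaStrategy.toScala` at call sites by delegating to the Java impl.
  implicit class RichJavaRetryStrategy[OUT](underlying: JRetryStrategy[OUT]) {
    def toScala: ScalaRetryStrategy[OUT] = new ScalaRetryStrategy[OUT] {
      override def canRetry(currentAttempts: Int): Boolean =
        underlying.canRetry(currentAttempts)
      override def getBackoffTimeMillis(currentAttempts: Int): Long =
        underlying.getBackoffTimeMillis(currentAttempts)
    }
  }
}
{code}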


was (Author: JIRAUSER295489):
[~lincoln.86xy], is there a reason why the `AsyncRetryStrategies` are only 
available in the Java API and not the Scala API? Similarly for the 
`RetryPredicates`?

I am fairly new to Flink, but I believe I have seen some `.toJava` and 
`.toScala` helper methods on other Flink components and was wondering if there 
is room to add the same functionality to the retry strategies builder and 
retry predicates?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-04 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612625#comment-17612625
 ] 

Eric Xiao commented on FLINK-29498:
---

[~lincoln.86xy], is there a reason why the `AsyncRetryStrategies` are only 
available in the Java API and not the Scala API? Similarly for the 
`RetryPredicates`?

I am fairly new to Flink, but I believe I have seen some `.toJava` and 
`.toScala` helper methods on other Flink components and was wondering if there 
is room to add the same functionality to the retry strategies builder and 
retry predicates?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-03 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-29498:
--
Description: 
We are using the async I/O to make HTTP calls, and one of the features we 
wanted to leverage was the retries, so we pulled the newest commit 
([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.

When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
from the Scala API with a retry strategy from the Java API, I get an error, as 
{{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that 
retry strategies were only implemented in Java and not Scala in this PR: 
[http://github.com/apache/flink/pull/19983].

 

Here is some of the code to reproduce the error:
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
JAsyncRetryStrategies}

// Build a fixed-delay retry strategy from the Java API (3 attempts, 100 ms delay)
val javaAsyncRetryStrategy =
  new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
    .build()

// Error: the Scala API's unorderedWaitWithRetry expects a Scala retry strategy
val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}

  was:
When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
from the Scala API with a retry strategy from the Java API, I get an error, as 
{{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that 
retry strategies were only implemented in Java and not Scala in this PR: 
[http://github.com/apache/flink/pull/19983].
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new 
JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
.build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}


> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Priority: Minor
>
> We are using the async I/O to make HTTP calls, and one of the features we 
> wanted to leverage was the retries, so we pulled the newest commit 
> ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
>  
> Here is some of the code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-03 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-29498:
--
Description: 
When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
from the Scala API with a retry strategy from the Java API, I get an error, as 
{{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that 
retry strategies were only implemented in Java and not Scala in this PR: 
[http://github.com/apache/flink/pull/19983].
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new 
JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
.build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}

  was:
When I try calling the function `AsyncDataStream.unorderedWaitWithRetry` from 
the Scala API with a retry strategy from the Java API, I get an error, as 
`unorderedWaitWithRetry` expects a Scala retry strategy. The problem is that 
retry strategies were only implemented in Java and not Scala in this PR: 
http://github.com/apache/flink/pull/19983.
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new 
JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
.build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}


> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.15.2
>Reporter: Eric Xiao
>Priority: Minor
>
> When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} 
> from the Scala API with a retry strategy from the Java API, I get an error, 
> as {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is 
> that retry strategies were only implemented in Java and not Scala in this 
> PR: [http://github.com/apache/flink/pull/19983].
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
> JAsyncRetryStrategies}
> val javaAsyncRetryStrategy = new 
> JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
> .build()
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API

2022-10-03 Thread Eric Xiao (Jira)
Eric Xiao created FLINK-29498:
-

 Summary: Flink Async I/O Retry Strategies Do Not Work for Scala 
AsyncDataStream API
 Key: FLINK-29498
 URL: https://issues.apache.org/jira/browse/FLINK-29498
 Project: Flink
  Issue Type: Bug
  Components: API / Scala
Affects Versions: 1.15.2
Reporter: Eric Xiao


When I try calling the function `AsyncDataStream.unorderedWaitWithRetry` from 
the Scala API with a retry strategy from the Java API, I get an error, as 
`unorderedWaitWithRetry` expects a Scala retry strategy. The problem is that 
retry strategies were only implemented in Java and not Scala in this PR: 
http://github.com/apache/flink/pull/19983.
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => 
JAsyncRetryStrategies}

// Build a fixed-delay retry strategy from the Java API (3 attempts, 100 ms delay)
val javaAsyncRetryStrategy =
  new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
    .build()

// Error: the Scala API's unorderedWaitWithRetry expects a Scala retry strategy
val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}
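
For anyone hitting this before Scala utilities land, one possible workaround 
(a sketch, not something verified in this ticket) is to drop down to the Java 
`AsyncDataStream`, which accepts the Java retry strategy directly. It reuses 
the names from the snippet above (`source`, `asyncOperator`, 
`pipelineTimeoutInMs`, `javaAsyncRetryStrategy`) and assumes `asyncOperator` 
implements the Java `AsyncFunction` interface:

{code:java}
import java.util.concurrent.TimeUnit
import org.apache.flink.streaming.api.datastream.{AsyncDataStream => JAsyncDataStream}

// Bypass the Scala wrapper: the Java API takes the Java retry strategy as-is.
val data = JAsyncDataStream.unorderedWaitWithRetry(
  source.javaStream,   // unwrap the Scala DataStream to its Java counterpart
  asyncOperator,       // must be a Java AsyncFunction here
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}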



--
This message was sent by Atlassian Jira
(v8.20.10#820010)