[jira] [Commented] (FLINK-25234) Flink should parse ISO timestamp in UTC format
[ https://issues.apache.org/jira/browse/FLINK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845049#comment-17845049 ] Eric Xiao commented on FLINK-25234: --- We just hit similar errors when trying to consume JSON data with our timestamp format set to `json.timestamp-format.standard=ISO-8601` while the timestamp data was in SQL format (and vice versa). We have found this to happen on many occasions where there is a mismatch between `json.timestamp-format.standard` and the timestamp data in the JSON, which has led to some friction and confusion for our users about what is wrong, since it is not clear from the error message. Is there a way we can automate the selection of timestamp formats? i.e. would it make sense, and be a good feature, to have a way to tell Flink to parse a timestamp with all supported formats? e.g. `json.timestamp-format.parse-style=\{strict, lenient}`, where strict would be the default to ensure backwards compatibility? Would love some feedback from the community / [~twalthr]; I think we can definitely reduce some friction here for the users. > Flink should parse ISO timestamp in UTC format > -- > > Key: FLINK-25234 > URL: https://issues.apache.org/jira/browse/FLINK-25234 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / API >Affects Versions: 1.14.0 >Reporter: Egor Ryashin >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > Error parsing timestamp with ISO-8601 format: > {code:java} > [ERROR] Could not execute SQL statement. Reason: > java.time.format.DateTimeParseException: Text '2021-12-08T12:59:57.028Z' > could not be parsed, unparsed text found at index 23 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
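To make the failure mode above concrete, here is a standalone java.time sketch, not Flink's JSON format code; the candidate list and the `parseLenient` helper are hypothetical illustrations of the proposed lenient mode. Strict ISO local parsing stops right before the trailing 'Z', which is exactly the "unparsed text found at index 23" error from the ticket, while a lenient mode could simply try each supported pattern in turn.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

public class TimestampParseDemo {

    // Candidate formats a hypothetical "lenient" mode could try in order.
    private static final List<DateTimeFormatter> CANDIDATES = List.of(
            DateTimeFormatter.ISO_OFFSET_DATE_TIME,              // 2021-12-08T12:59:57.028Z
            DateTimeFormatter.ISO_LOCAL_DATE_TIME,               // 2021-12-08T12:59:57.028
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")); // SQL style

    // Try each candidate until one parses; sketches the proposed lenient behavior.
    static LocalDateTime parseLenient(String text) {
        for (DateTimeFormatter f : CANDIDATES) {
            try {
                return f == DateTimeFormatter.ISO_OFFSET_DATE_TIME
                        ? OffsetDateTime.parse(text, f)
                                .withOffsetSameInstant(ZoneOffset.UTC).toLocalDateTime()
                        : LocalDateTime.parse(text, f);
            } catch (DateTimeParseException ignored) {
                // fall through to the next format
            }
        }
        throw new IllegalArgumentException("No supported format matched: " + text);
    }

    public static void main(String[] args) {
        // Strict ISO local parsing rejects the trailing 'Z' at index 23,
        // matching the reported DateTimeParseException.
        try {
            LocalDateTime.parse("2021-12-08T12:59:57.028Z", DateTimeFormatter.ISO_LOCAL_DATE_TIME);
        } catch (DateTimeParseException e) {
            System.out.println("strict failed: " + e.getMessage());
        }
        System.out.println("lenient: " + parseLenient("2021-12-08T12:59:57.028Z"));
        System.out.println("lenient: " + parseLenient("2021-12-08 12:59:57"));
    }
}
```

The ordering of the candidate list is the main design question for such a mode: strict first, then progressively looser formats, so that well-formed data never changes meaning.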
[jira] [Created] (FLINK-31941) Not backwards compatible naming for some kubernetes resources
Eric Xiao created FLINK-31941: - Summary: Not backwards compatible naming for some kubernetes resources Key: FLINK-31941 URL: https://issues.apache.org/jira/browse/FLINK-31941 Project: Flink Issue Type: Bug Components: Kubernetes Operator Reporter: Eric Xiao We are in the process of migrating all our workloads over to the Kubernetes operator and noticed that some of the Kubernetes resources in the operator are hardcoded and not consistent with how Flink previously defined them. This is leading to some downstream incompatibilities and migration toil for us in our monitoring and dashboards that have queries depending on the previous naming schema. I couldn't find exact definitions of a task-manager or job-manager in the Flink repo, but this is what I have noticed; I may be wrong in my interpretations. h3. Deployment Names Previously: {code:java} NAME READY UP-TO-DATE AVAILABLE AGE trickle-job-manager 2/2 2 2 13d trickle-task-manager 10/10 10 10 13d {code} New (Flink Operator): {code:java} NAME READY UP-TO-DATE AVAILABLE AGE trickle 2/2 2 2 6h25m trickle-taskmanager 4/4 4 4 6h25m {code} [1] [https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-standalone/src/main/java/org/apache/flink/kubernetes/operator/utils/StandaloneKubernetesUtils.java#L29-L38] h3. 
Pod Names Previously: {code:java} NAME READY STATUS RESTARTS AGE trickle-job-manager-65d95d4854-lgmsm 1/1 Running 0 13d trickle-job-manager-65d95d4854-vdzl8 1/1 Running 0 5d trickle-task-manager-86c85cf647-46nxh 1/1 Running 0 5d trickle-task-manager-86c85cf647-ct6c5 1/1 Running 0 5d trickle-task-manager-86c85cf647-h894q 1/1 Running 0 5d trickle-task-manager-86c85cf647-kpr5x 1/1 Running 0 5d{code} New (Flink Operator): {code:java} NAME READY STATUS RESTARTS AGE trickle-58f895675f-9m5wm 1/1 Running 0 25h trickle-58f895675f-n4hhv 1/1 Running 0 25h trickle-taskmanager-6f9f64b9b9-857lv 1/1 Running 0 25h trickle-taskmanager-6f9f64b9b9-cnsrx 1/1 Running 0 25h{code} The pod names stem from the deployment names, so a fix to update the deployment names may also fix the pod names. -- This message was sent by Atlassian Jira (v8.20.10#820010)
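To make the schema change concrete, here is an illustrative sketch contrasting the two naming schemes as observed above; the helper methods are hypothetical, while the operator's real constants live in StandaloneKubernetesUtils [1].

```java
// Illustrative only: hypothetical helpers contrasting the naming schemes
// observed in this ticket; not the actual operator code.
public class DeploymentNames {
    // Naming observed before the migration to the operator.
    static String legacyJobManagerName(String clusterId) { return clusterId + "-job-manager"; }
    static String legacyTaskManagerName(String clusterId) { return clusterId + "-task-manager"; }

    // Operator naming: the JobManager deployment reuses the bare cluster id,
    // and the TaskManager suffix loses the hyphen between "task" and "manager".
    static String operatorJobManagerName(String clusterId) { return clusterId; }
    static String operatorTaskManagerName(String clusterId) { return clusterId + "-taskmanager"; }

    public static void main(String[] args) {
        // A monitoring query matching "trickle-task-manager.*" silently stops
        // matching after the migration, which is the toil described above.
        System.out.println(legacyTaskManagerName("trickle") + " -> " + operatorTaskManagerName("trickle"));
        System.out.println(legacyJobManagerName("trickle") + " -> " + operatorJobManagerName("trickle"));
    }
}
```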
[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-31873: -- Description: When turning on Flink reactive mode, it is suggested to convert all {{setParallelism}} calls to {{setMaxParallelism}} from [elastic scaling docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. With the current implementation of the {{DataStreamSink}} class, only the {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} function of the {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} class is exposed - {{Transformation}} also has the {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} function which is not exposed. This means for any sink in the Flink pipeline, we cannot set a max parallelism. was: When turning on Flink reactive mode, it is suggested to convert all {{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. 
With the current implementation of the {{DataStreamSink}} class, only the {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} function of the {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} class is exposed - {{Transformation}} also has the {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} function which is not exposed. This means for any sink in the Flink pipeline, we cannot set a max parallelism. > Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Major > Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png > > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism}} from [elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. 
> > This means for any sink in the Flink pipeline, we cannot set a max > parallelism. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-31873: -- Description: When turning on Flink reactive mode, it is suggested to convert all {{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. With the current implementation of the {{DataStreamSink}} class, only the {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} function of the {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} class is exposed - {{Transformation}} also has the {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} function which is not exposed. This means for any sink in the Flink pipeline, we cannot set a max parallelism. was: When turning on Flink reactive mode, it is suggested to convert all {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. 
With the current implementation of the {{DataStreamSink}} class, only the {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} function of the {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} class is exposed - {{Transformation}} also has the {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} function which is not exposed. > Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Major > Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png > > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. > > This means for any sink in the Flink pipeline, we cannot set a max > parallelism. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715028#comment-17715028 ] Eric Xiao commented on FLINK-31873: --- Thanks [~martijnvisser] and [~luoyuxia], I will start with opening up a thread in the Dev mailing list before making a FLIP :). > Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Major > Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png > > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714745#comment-17714745 ] Eric Xiao commented on FLINK-31873: --- I have an open PR to address this issue: https://github.com/apache/flink/pull/22438 > Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Blocker > Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png > > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-31873: -- Attachment: Screenshot 2023-04-20 at 4.33.14 PM.png > Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Blocker > Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png > > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
[ https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-31873: -- Description: When turning on Flink reactive mode, it is suggested to convert all {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. With the current implementation of the {{DataStreamSink}} class, only the {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} function of the {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} class is exposed - {{Transformation}} also has the {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} function which is not exposed. was: When turning on Flink reactive mode, it is suggested to convert all `setParallelism` calls to `setMaxParallelism`, [elastic scaling|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. With the current implementation of the `DataStreamSink`, only the [`setParallelism` function|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181] of the [`Transformation` class is exposed - `Transformation` also has the `setMaxParallelism` function which is not exposed|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]. 
> Add setMaxParallelism to the DataStreamSink Class > - > > Key: FLINK-31873 > URL: https://issues.apache.org/jira/browse/FLINK-31873 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Eric Xiao >Priority: Blocker > > When turning on Flink reactive mode, it is suggested to convert all > {{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling > docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]. > With the current implementation of the {{DataStreamSink}} class, only the > {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}} > function of the > {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}} > class is exposed - {{Transformation}} also has the > {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}} > function which is not exposed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class
Eric Xiao created FLINK-31873: - Summary: Add setMaxParallelism to the DataStreamSink Class Key: FLINK-31873 URL: https://issues.apache.org/jira/browse/FLINK-31873 Project: Flink Issue Type: Bug Components: API / DataStream Reporter: Eric Xiao When turning on Flink reactive mode, it is suggested to convert all `setParallelism` calls to `setMaxParallelism` (see [elastic scaling|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration]). With the current implementation of `DataStreamSink`, only the [`setParallelism` function|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181] of the [`Transformation`|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285] class is exposed; `Transformation` also has a `setMaxParallelism` function which is not exposed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
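For illustration, a minimal sketch of what exposing the setter could look like, using simplified stand-in classes rather than the real org.apache.flink classes: the new setter would be a one-line delegation to the underlying transformation, exactly like the existing setParallelism.

```java
// Minimal stand-ins for the real Flink classes, to show the one-line
// delegation this ticket asks for; none of this is the actual Flink code.
public class SinkMaxParallelismSketch {

    static class Transformation {
        private int parallelism = -1;     // -1 means "use the default"
        private int maxParallelism = -1;  // -1 means "not set"

        void setParallelism(int parallelism) { this.parallelism = parallelism; }
        void setMaxParallelism(int maxParallelism) { this.maxParallelism = maxParallelism; }
        int getParallelism() { return parallelism; }
        int getMaxParallelism() { return maxParallelism; }
    }

    static class DataStreamSink {
        private final Transformation transformation = new Transformation();

        // Already exposed today: forwards to the underlying transformation.
        DataStreamSink setParallelism(int parallelism) {
            transformation.setParallelism(parallelism);
            return this; // fluent style, like the real API
        }

        // The missing setter: the same one-line delegation as setParallelism,
        // so reactive mode can bound the parallelism of sinks too.
        DataStreamSink setMaxParallelism(int maxParallelism) {
            transformation.setMaxParallelism(maxParallelism);
            return this;
        }

        Transformation getTransformation() { return transformation; }
    }

    public static void main(String[] args) {
        DataStreamSink sink = new DataStreamSink().setParallelism(4).setMaxParallelism(8);
        System.out.println("max parallelism = " + sink.getTransformation().getMaxParallelism());
    }
}
```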
[jira] [Commented] (FLINK-29073) [FLIP-91] Support SQL Gateway(Part 2)
[ https://issues.apache.org/jira/browse/FLINK-29073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637941#comment-17637941 ] Eric Xiao commented on FLINK-29073: --- Thanks [~yzl]! Yes, let's connect after the FLIP is accepted :). My team and I are interested in working on #5; is there some sort of timeline the community has in mind for getting these issues closed? I will take a look at your PRs to see what is needed to expose `completeStatement` to the REST API. > [FLIP-91] Support SQL Gateway(Part 2) > - > > Key: FLINK-29073 > URL: https://issues.apache.org/jira/browse/FLINK-29073 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client, Table SQL / Gateway >Affects Versions: 1.17.0 >Reporter: Shengkai Fang >Priority: Major > > This issue continues improving the SQL Gateway and allows the SQL Client to submit > jobs to the SQL Gateway. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-21301) Decouple window aggregate allow lateness with state ttl configuration
[ https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636864#comment-17636864 ] Eric Xiao edited comment on FLINK-21301 at 11/21/22 8:16 PM: - Hi there, I had a read of [https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was wondering if there was any particular reason why the allow-lateness configuration is not enabled for Window TVF aggregations, since Group Window Aggregation is deprecated [1]? Would it be hard to implement? We might have bandwidth on our team to work on it. [1] - https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-agg/#group-window-aggregation was (Author: JIRAUSER295489): Hi there, I had a read of [https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was wondering if there was any particular reason why the allow-lateness configuration is not enabled for Window TVF aggregations? > Decouple window aggregate allow lateness with state ttl configuration > - > > Key: FLINK-21301 > URL: https://issues.apache.org/jira/browse/FLINK-21301 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Jing Zhang >Assignee: Jing Zhang >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.14.0 > > > Currently, the state retention time config also affects the state clean-up behavior > of Window Aggregate, which is unexpected for most users. > E.g. for the following example, a user would set `MinIdleStateRetentionTime` to > 1 day to clean state in `deduplicate`. However, it also affects the clean-up > behavior of window aggregate. For example, 2021-01-04 data would be cleaned at > 2021-01-06 instead of 2021-01-05. 
> {code:sql} > SELECT > DATE_FORMAT(tumble_end(ROWTIME ,interval '1' DAY),'yyyy-MM-dd') as stat_time, > count(1) first_phone_num > FROM ( > SELECT > ROWTIME, > user_id, > row_number() over(partition by user_id, pdate order by ROWTIME ) as rn > FROM source_kafka_biz_shuidi_sdb_crm_call_record > ) cal > where rn =1 > group by tumble(ROWTIME,interval '1' DAY);{code} > It's better to decouple window aggregate allow lateness from > `MinIdleStateRetentionTime`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
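The timing in the quoted example can be checked with plain java.time arithmetic. This is a simplified model of the coupling being discussed, not Flink's actual state backend logic, and the method names are illustrative: with a 1-day tumbling window, the window covering 2021-01-04 ends at 2021-01-05 00:00, but with a 1-day idle-state retention the window state is only dropped a day later.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;

// Simplified model of the coupling between window end and state retention
// described in FLINK-21301; not Flink internals.
public class WindowCleanupSketch {

    // A daily tumbling window over `day` ends at midnight of the following day.
    static LocalDateTime windowEnd(LocalDate day) {
        return day.plusDays(1).atStartOfDay();
    }

    // Current behavior as described: window state is only dropped `retention`
    // after the window ends, rather than right at the window end.
    static LocalDateTime cleanupTime(LocalDate day, Duration retention) {
        return windowEnd(day).plus(retention);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2021, 1, 4);
        Duration retention = Duration.ofDays(1); // MinIdleStateRetentionTime = 1 day
        System.out.println("window ends:   " + windowEnd(day));              // 2021-01-05T00:00
        System.out.println("state cleaned: " + cleanupTime(day, retention)); // 2021-01-06T00:00
    }
}
```

Decoupling the two would mean the cleanup time for the window state depends only on the window end (plus allowed lateness), regardless of the retention configured for other stateful operators like the deduplication.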
[jira] [Commented] (FLINK-21301) Decouple window aggregate allow lateness with state ttl configuration
[ https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636864#comment-17636864 ] Eric Xiao commented on FLINK-21301: --- Hi there, I had a read of [https://www.mail-archive.com/user@flink.apache.org/msg43316.html] and was wondering if there was any particular reason why the allow-lateness configuration is not enabled for Window TVF aggregations? > Decouple window aggregate allow lateness with state ttl configuration > - > > Key: FLINK-21301 > URL: https://issues.apache.org/jira/browse/FLINK-21301 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Jing Zhang >Assignee: Jing Zhang >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.14.0 > > > Currently, the state retention time config also affects the state clean-up behavior > of Window Aggregate, which is unexpected for most users. > E.g. for the following example, a user would set `MinIdleStateRetentionTime` to > 1 day to clean state in `deduplicate`. However, it also affects the clean-up > behavior of window aggregate. For example, 2021-01-04 data would be cleaned at > 2021-01-06 instead of 2021-01-05. > {code:sql} > SELECT > DATE_FORMAT(tumble_end(ROWTIME ,interval '1' DAY),'yyyy-MM-dd') as stat_time, > count(1) first_phone_num > FROM ( > SELECT > ROWTIME, > user_id, > row_number() over(partition by user_id, pdate order by ROWTIME ) as rn > FROM source_kafka_biz_shuidi_sdb_crm_call_record > ) cal > where rn =1 > group by tumble(ROWTIME,interval '1' DAY);{code} > It's better to decouple window aggregate allow lateness from > `MinIdleStateRetentionTime`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29073) [FLIP-91] Support SQL Gateway(Part 2)
[ https://issues.apache.org/jira/browse/FLINK-29073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636842#comment-17636842 ] Eric Xiao commented on FLINK-29073: --- Hi there, are any of the tasks above good for someone new to the community to start working on? > [FLIP-91] Support SQL Gateway(Part 2) > - > > Key: FLINK-29073 > URL: https://issues.apache.org/jira/browse/FLINK-29073 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client, Table SQL / Gateway >Affects Versions: 1.17.0 >Reporter: Shengkai Fang >Priority: Major > > This issue continues improving the SQL Gateway and allows the SQL Client to submit > jobs to the SQL Gateway. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs
[ https://issues.apache.org/jira/browse/FLINK-29837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628369#comment-17628369 ] Eric Xiao commented on FLINK-29837: --- Hi [~luoyuxia], thanks for responding! We are exploring the Flink SQL API and have been translating some of our DataStream API pipelines to Flink SQL. In our pipelines we have business logic that depends on the kind of changelog event a particular row is associated with. I saw that there is a configuration parameter [1], the `ChangelogMode` you can pass to `fromChangelogStream`, but I don't think that addresses our needs. [1] https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/data_stream_api/#handling-of-changelog-streams > SQL API does not expose the RowKind of the Row for processing Changelogs > > > Key: FLINK-29837 > URL: https://issues.apache.org/jira/browse/FLINK-29837 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.16.0 >Reporter: Eric Xiao >Priority: Major > > When working with `{{{}ChangeLog{}}}` data in the SQL API it was a bit > misleading to see that the `{{{}op{}}}` column appears{^}[1]{^} > in the table schema of the printed results but it is not available to be used in > the SQL API: > {code:java} > val tableEnv = StreamTableEnvironment.create(env) > val dataStream = env.fromElements( > Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)), > Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)), > Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100)) > )(Types.ROW(Types.STRING, Types.INT)) > // interpret the DataStream as a Table > val table = > tableEnv.fromChangelogStream(dataStream, > Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert()) > // register the table under a name and perform an aggregation > tableEnv.createTemporaryView("InputTable", table) > tableEnv > .sqlQuery("SELECT * FROM InputTable where op = '+I'") > .execute() > .print() {code} > The error logs: 
> > > {code:java} > Exception in thread "main" org.apache.flink.table.api.ValidationException: > SQL validation failed. From line 1, column 32 to line 1, column 33: Column > 'op' not found in any table > at > org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184) > at > org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109) > at > org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237) > at > org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675){code} > It would be nice to expose the `op` column to be usable in the Flink SQL APIs > as it is in the DataStream APIs. > [1] > [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
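For reference, a standalone mirror of the RowKind short strings involved here. The real enum is org.apache.flink.types.RowKind in flink-core; the sketch below only reproduces the mapping, and the workaround in the comments is a possibility, not something the ticket prescribes.

```java
// Standalone mirror of org.apache.flink.types.RowKind and its short strings;
// a sketch for reference, not the real Flink class.
public class RowKindSketch {
    enum RowKind {
        INSERT("+I"), UPDATE_BEFORE("-U"), UPDATE_AFTER("+U"), DELETE("-D");

        private final String shortString;
        RowKind(String shortString) { this.shortString = shortString; }
        String shortString() { return shortString; }
    }

    public static void main(String[] args) {
        // A possible DataStream-side workaround while SQL cannot see the kind:
        // copy row.getKind().shortString() into an ordinary String column
        // before converting the stream to a table, then filter on that
        // column in SQL.
        for (RowKind kind : RowKind.values()) {
            System.out.println(kind + " -> " + kind.shortString());
        }
    }
}
```

The `'+I'` literal in the query from the issue corresponds to `RowKind.INSERT` in this mapping.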
[jira] [Updated] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs
[ https://issues.apache.org/jira/browse/FLINK-29837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-29837: -- Description: When working with `{{{}ChangeLog{}}}` data in the SQL API it was a bit misleading to see that the `{{{}op{}}}` column appears{^}[1]{^} in the table schema of the printed results but it is not available to be used in the SQL API: {code:java} val tableEnv = StreamTableEnvironment.create(env) val dataStream = env.fromElements( Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)), Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)), Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100)) )(Types.ROW(Types.STRING, Types.INT)) // interpret the DataStream as a Table val table = tableEnv.fromChangelogStream(dataStream, Schema.newBuilder().primaryKey("f0").build(), ChangelogMode.upsert()) // register the table under a name and perform an aggregation tableEnv.createTemporaryView("InputTable", table) tableEnv .sqlQuery("SELECT * FROM InputTable where op = '+I'") .execute() .print() {code} The error logs: {code:java} Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL validation failed. From line 1, column 32 to line 1, column 33: Column 'op' not found in any table at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109) at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237) at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105) at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675){code} It would be nice to expose the `op` column to be usable in the Flink SQL APIs as it is in the DataStream APIs. 
[1] [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream] > SQL API does not expose the RowKind of the Row for processing Changelogs >
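A possible interim workaround (a sketch only, untested, continuing the snippet above; the view name and the default column name {{f2}} are assumptions) is to materialize each Row's RowKind as an ordinary trailing column before converting the stream. The trade-off is that upsert semantics are lost, because the stream is then interpreted as insert-only:
{code:java}
// Sketch: append the RowKind's short string ("+I", "-U", "+U", "-D") as an extra column.
val withOp = dataStream.map { row =>
  Row.join(row, Row.of(row.getKind.shortString))
}(Types.ROW(Types.STRING, Types.INT, Types.STRING))

// fromDataStream (not fromChangelogStream) keeps every record as an insert,
// so the copied kind column (default name f2) stays queryable from SQL.
tableEnv.createTemporaryView("InputTableWithOp", tableEnv.fromDataStream(withOp))
tableEnv.sqlQuery("SELECT * FROM InputTableWithOp WHERE f2 = '+I'").execute().print()
{code}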
[jira] [Created] (FLINK-29837) SQL API does not expose the RowKind of the Row for processing Changelogs
Eric Xiao created FLINK-29837: - Summary: SQL API does not expose the RowKind of the Row for processing Changelogs Key: FLINK-29837 URL: https://issues.apache.org/jira/browse/FLINK-29837 Project: Flink Issue Type: Bug Components: Table SQL / API Affects Versions: 1.16.0 Reporter: Eric Xiao When working with {{ChangeLog}} data in the SQL API, it is misleading that the `op` column appears[1] in the table schema of printed results but is not available for use in the SQL API:
{code:java}
val tableEnv = StreamTableEnvironment.create(env)

val dataStream = env.fromElements(
  Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
  Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
  Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
)(Types.ROW(Types.STRING, Types.INT))

// interpret the DataStream as a Table
val table = tableEnv.fromChangelogStream(
  dataStream,
  Schema.newBuilder().primaryKey("f0").build(),
  ChangelogMode.upsert())

// register the table under a name and perform an aggregation
tableEnv.createTemporaryView("InputTable", table)

tableEnv
  .sqlQuery("SELECT * FROM InputTable where op = '+I'")
  .execute()
  .print()
{code}
The error log:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL validation failed.
From line 1, column 32 to line 1, column 33: Column 'op' not found in any table
    at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:184)
    at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:109)
    at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
    at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675)
    at com.shopify.trickle.pipelines.IteratorPipeline$.delayedEndpoint$com$shopify$trickle$pipelines$IteratorPipeline$1(IteratorPipeline.scala:32)
    at com.shopify.trickle.pipelines.IteratorPipeline$delayedInit$body.apply(IteratorPipeline.scala:11)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at com.shopify.trickle.pipelines.IteratorPipeline$.main(IteratorPipeline.scala:11)
    at com.shopify.trickle.pipelines.IteratorPipeline.main(IteratorPipeline.scala) {code}
It would be nice to expose the `op` column so it is usable in the Flink SQL APIs as it is in the DataStream APIs.
[1] [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]
[ https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154 ] Eric Xiao edited comment on FLINK-20578 at 10/26/22 7:24 PM: - Hi, I wanted to get more involved in contributing to the Flink project and found this starter task - my team is working with the Table / SQL APIs, so I thought this would be a good first task to work on :). [~surahman] are you still working on this issue? If not, I would love to take over. > If Flink supports empty arrays, which data type should the array elements have? Does it cause new problems? [~pensz] I think what we can do is: # Set a default data type when the empty array is created (see below for examples from other databases). Two options for the default data type: ## Use or create a new empty/void data type. ## Use an existing data type, e.g. Integer. # Teach operations such as `COALESCE` to coerce the default data type to the desired data type (a separate PR). I think step 1 is sufficient to unblock users creating an empty array for the time being, but step 2 is required for a seamless SQL experience. Without step 2, users will most likely have to convert their empty array's data type manually. With step 1: {code:java} SELECT COALESCE(int_column, CAST(ARRAY[] AS ARRAY<INT>)){code} With steps 1 and 2: {code:java} SELECT COALESCE(int_column, ARRAY[]) {code} *Default Data Type* I believe the query in this issue qualifies as a query with no context. I tested other query engines and these are the results I got: Trino: !Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340! They use an unknown datatype Spark: !image-2022-10-26-14-42-08-468.png|width=505,height=112! They use an unknown datatype BigQuery: !Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158! !image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype > Cannot create empty array using ARRAY[] > --- > > Key: FLINK-20578 > URL: https://issues.apache.org/jira/browse/FLINK-20578 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.11.2 >Reporter: Fabian Hueske >Priority: Major > Labels: pull-request-available, starter > Fix For: 1.17.0 > > Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot > 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, > Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, > image-2022-10-26-14-42-57-579.png > > > Calling the ARRAY function without an element (`ARRAY[]`) results in an error > message. > Is that the expected behavior? > How can users create empty arrays? -- This message was sent by Atlassian Jira (v8.20.10#820010)
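The two-step proposal above can be summarized with a concrete sketch (the table {{t}} and column {{int_array_column}} are made-up names, and whether {{CAST(ARRAY[] AS ARRAY<INT>)}} parses depends on the Flink version):
{code:sql}
-- Today: a bare ARRAY[] fails to type-check, so the element type must be forced:
SELECT COALESCE(int_array_column, CAST(ARRAY[] AS ARRAY<INT>)) FROM t;

-- With a default element type (step 1) plus COALESCE type coercion (step 2),
-- the explicit cast becomes unnecessary:
SELECT COALESCE(int_array_column, ARRAY[]) FROM t;
{code}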
[jira] [Commented] (FLINK-20578) Cannot create empty array using ARRAY[]
[ https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154 ] Eric Xiao commented on FLINK-20578: --- Hi, I wanted to get more involved in contributing to the Flink project and found this starter task - my team is working with the Table / SQL APIs, so I thought this would be a good first task to work on :). [~surahman] are you still working on this issue? I noticed it has been a year since your last comment. > If Flink supports empty arrays, which data type should the array elements have? Does it cause new problems? [~pensz] I tested a similar query in Trino (Presto) and BigQuery, and they use Integer as the default data type. Could this be a good default behaviour? !Screen Shot 2022-10-25 at 10.50.42 PM.png! > Cannot create empty array using ARRAY[] > --- > > Key: FLINK-20578 > URL: https://issues.apache.org/jira/browse/FLINK-20578 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.11.2 >Reporter: Fabian Hueske >Priority: Major > Labels: starter > Fix For: 1.17.0 > > Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot > 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png > > > Calling the ARRAY function without an element (`ARRAY[]`) results in an error > message. > Is that the expected behavior? > How can users create empty arrays? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-20578) Cannot create empty array using ARRAY[]
[ https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-20578: -- Attachment: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621332#comment-17621332 ] Eric Xiao commented on FLINK-29498: --- [~gaoyunhaii], I believe I have a [PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state. What is the process for getting folks from the community to review the PR?

> Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
> --------------------------------------------------------------------------
>
> Key: FLINK-29498
> URL: https://issues.apache.org/jira/browse/FLINK-29498
> Project: Flink
> Issue Type: Bug
> Components: API / Scala
> Affects Versions: 1.15.2
> Reporter: Eric Xiao
> Assignee: Eric Xiao
> Priority: Minor
>
> We are using async I/O to make HTTP calls, and one of the features we wanted to leverage was retries, so we pulled the newest commit ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.
> When I try calling {{AsyncDataStream.unorderedWaitWithRetry}} from the Scala API with a retry strategy from the Java API, I get an error because {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that retry strategies were implemented only in Java, not Scala, in that PR.
>
> Here is some code to reproduce the error:
> {code:java}
> import org.apache.flink.streaming.api.scala.AsyncDataStream
> import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => JAsyncRetryStrategies}
>
> val javaAsyncRetryStrategy = new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
>   .build()
>
> val data = AsyncDataStream.unorderedWaitWithRetry(
>   source,
>   asyncOperator,
>   pipelineTimeoutInMs,
>   TimeUnit.MILLISECONDS,
>   javaAsyncRetryStrategy
> ){code}
[jira] [Comment Edited] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621332#comment-17621332 ] Eric Xiao edited comment on FLINK-29498 at 10/20/22 7:30 PM: --- Hi [~gaoyunhaii], I believe I have a [PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state. What is the process for getting folks from the community to review the PR?

was (Author: JIRAUSER295489): [~gaoyunhaii], I believe I have a [PR|https://github.com/apache/flink/pull/21077] that is in a reviewable state, what is the process of getting folks from the community to review the PR?
[jira] [Commented] (FLINK-25595) Specify hash/sort aggregate strategy in SQL hint
[ https://issues.apache.org/jira/browse/FLINK-25595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621193#comment-17621193 ] Eric Xiao commented on FLINK-25595: --- Hi [~godfreyhe] and [~jingzhang], I would like to get more involved in the Flink community. How can someone new to the code base get familiar with this part of it? I see that there is still an open issue related to this one. I am familiar (using this word lightly) with how query optimizations work from using other systems like Spark and Trino.

> Specify hash/sort aggregate strategy in SQL hint
> ------------------------------------------------
>
> Key: FLINK-25595
> URL: https://issues.apache.org/jira/browse/FLINK-25595
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner
> Reporter: Jing Zhang
> Assignee: ZhuoYu Chen
> Priority: Major
[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618378#comment-17618378 ] Eric Xiao commented on FLINK-29498: --- As discussed in the [slack thread|https://apache-flink.slack.com/archives/C03G7LJTS2G/p1663957561957419], I am eager to contribute the Scala util classes :). Should we assign the Jira ticket to me (not sure how to do that)?
[jira] [Comment Edited] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612625#comment-17612625 ] Eric Xiao edited comment on FLINK-29498 at 10/4/22 2:03 PM:

> For your case, you can try to implement your own `AsyncRetryStrategy` to enable retries.

Thanks [~lincoln.86xy] for replying :). This is what we have done so far; thankfully the code is not a lot. But we were wondering if there is a reason why the `AsyncRetryStrategies` are only available in the Java API and not the Scala API? Similarly for the `RetryPredicates`? I am fairly new to Flink, but I believe I have seen some `.toJava` and `.toScala` helper methods on other Flink components, and was wondering if there is room to add the same functionality to the retry strategies builder and retry predicates?

was (Author: JIRAUSER295489): [~lincoln.86xy], is there a reason why the `AsyncRetryStrategies` are only available in Java API and not the Scala API? Similar for the `RetryPredicates`? I am fairly new to Flink, but I believe I have seem some `.toJava` and `.toScala` helper methods on other Flink components and was wondering if there is room to add the same such functionality to the retry strategies builder and retry predicates?
[jira] [Commented] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612625#comment-17612625 ] Eric Xiao commented on FLINK-29498: --- [~lincoln.86xy], is there a reason why the `AsyncRetryStrategies` are only available in the Java API and not the Scala API? Similarly for the `RetryPredicates`? I am fairly new to Flink, but I believe I have seen some `.toJava` and `.toScala` helper methods on other Flink components, and was wondering if there is room to add the same functionality to the retry strategies builder and retry predicates?
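The `.toJava` / `.toScala` helpers discussed in this thread are, at heart, thin adapters that re-expose one API's interface through the other's without duplicating the retry logic. A language-neutral sketch of that adapter pattern in Java follows; all interface and method names here are hypothetical stand-ins for illustration, not Flink's actual types:

```java
public class RetryStrategyAdapter {
    // Hypothetical stand-ins for the two parallel APIs.
    interface JavaRetryStrategy {
        boolean canRetry(int attempt);
        long getBackoffTimeMillis(int attempt);
    }

    interface ScalaStyleRetryStrategy {
        boolean canRetry(int attempt);
        long backoffTimeMillis(int attempt);
    }

    // The "toScala"-style helper: wrap the Java-side strategy so it satisfies
    // the Scala-facing interface. Each call simply delegates.
    static ScalaStyleRetryStrategy toScalaStyle(JavaRetryStrategy j) {
        return new ScalaStyleRetryStrategy() {
            public boolean canRetry(int attempt) { return j.canRetry(attempt); }
            public long backoffTimeMillis(int attempt) { return j.getBackoffTimeMillis(attempt); }
        };
    }

    public static void main(String[] args) {
        // A fixed-delay Java-side strategy: up to 3 attempts, 100 ms apart.
        JavaRetryStrategy javaStrategy = new JavaRetryStrategy() {
            public boolean canRetry(int attempt) { return attempt <= 3; }
            public long getBackoffTimeMillis(int attempt) { return 100L; }
        };

        ScalaStyleRetryStrategy adapted = toScalaStyle(javaStrategy);
        System.out.println(adapted.canRetry(3) + " " + adapted.backoffTimeMillis(1));
    }
}
```

Because the adapter holds no state of its own, it is cheap to construct per call site, which is presumably why Flink uses this shape for its other Java/Scala bridges.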
[jira] [Updated] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-29498: -- Description:
We are using async I/O to make HTTP calls, and one of the features we wanted to leverage was retries, so we pulled the newest commit ([http://github.com/apache/flink/pull/19983]) into our internal Flink fork.

When I try calling {{AsyncDataStream.unorderedWaitWithRetry}} from the Scala API with a retry strategy from the Java API, I get an error because {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that retry strategies were implemented only in Java, not Scala, in this PR: [http://github.com/apache/flink/pull/19983].

Here is some code to reproduce the error:
{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
  .build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}

was: When I try calling the function {{AsyncDataStream.unorderedWaitWithRetry}} from the scala API I with a retry strategy from the java API I get an error as {{unorderedWaitWithRetry}} expects a scala retry strategy. The problem is that retry strategies were only implemented in java and not Scala in this PR: [http://github.com/apache/flink/pull/19983].
[jira] [Updated] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
[ https://issues.apache.org/jira/browse/FLINK-29498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Xiao updated FLINK-29498: -- Description:
When I try calling {{AsyncDataStream.unorderedWaitWithRetry}} from the Scala API with a retry strategy from the Java API, I get an error because {{unorderedWaitWithRetry}} expects a Scala retry strategy. The problem is that retry strategies were only implemented in Java and not Scala in this PR: [http://github.com/apache/flink/pull/19983].

{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
  .build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}

was: When I try calling the function `AsyncDataStream.unorderedWaitWithRetry` from the scala API I with a retry strategy from the java API I get an error as `unorderedWaitWithRetry` expects a scala retry strategy. The problem is that retry strategies were only implemented in java and not Scala in this PR: http://github.com/apache/flink/pull/19983.
[jira] [Created] (FLINK-29498) Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API
Eric Xiao created FLINK-29498: - Summary: Flink Async I/O Retry Strategies Do Not Work for Scala AsyncDataStream API Key: FLINK-29498 URL: https://issues.apache.org/jira/browse/FLINK-29498 Project: Flink Issue Type: Bug Components: API / Scala Affects Versions: 1.15.2 Reporter: Eric Xiao

When I try calling the function `AsyncDataStream.unorderedWaitWithRetry` from the Scala API with a retry strategy from the Java API, I get an error as `unorderedWaitWithRetry` expects a Scala retry strategy. The problem is that retry strategies were only implemented in Java and not Scala in this PR: http://github.com/apache/flink/pull/19983.

{code:java}
import org.apache.flink.streaming.api.scala.AsyncDataStream
import org.apache.flink.streaming.util.retryable.{AsyncRetryStrategies => JAsyncRetryStrategies}

val javaAsyncRetryStrategy = new JAsyncRetryStrategies.FixedDelayRetryStrategyBuilder[Int](3, 100L)
  .build()

val data = AsyncDataStream.unorderedWaitWithRetry(
  source,
  asyncOperator,
  pipelineTimeoutInMs,
  TimeUnit.MILLISECONDS,
  javaAsyncRetryStrategy
){code}
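For readers unfamiliar with the builder used in the reproduction above: a fixed-delay strategy built with arguments `(3, 100L)` retries a failed attempt up to 3 times with a constant 100 ms pause between attempts. A minimal plain-Java sketch of those semantics (method and class names are hypothetical; this is not the Flink implementation):

```java
import java.util.function.Supplier;

public class FixedDelayRetry {
    // Run the action up to maxAttempts times, sleeping delayMillis between
    // failed attempts; rethrow the last failure if every attempt fails.
    static <T> T retry(Supplier<T> action, int maxAttempts, long delayMillis)
            throws InterruptedException {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) Thread.sleep(delayMillis);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Fails twice, succeeds on the third attempt -- inside the 3-attempt budget.
        String result = retry(() -> {
            calls[0]++;
            if (calls[0] < 3) throw new RuntimeException("transient failure");
            return "ok";
        }, 3, 100L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Flink's actual strategies additionally decide retries per result or exception (the role of the `RetryPredicates` mentioned earlier in the thread); this sketch only shows the attempt-count and delay behaviour.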