[
https://issues.apache.org/jira/browse/FLINK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825176#comment-17825176
]
Sergey Nuyanzin commented on FLINK-31663:
-----------------------------------------
{quote}
Since the behavior of array_union is already aligned with Spark's (without
duplicates), if we don't align here, the logic of the entire function would
seem inconsistent. Should we change the behavior of array_union? But if we
change it, it will cause a version compatibility problem.
{quote}
I don't think we should copy everything that is present in Spark.
There are also Snowflake, ClickHouse, PostgreSQL, DuckDB, etc.
{{ARRAY_EXCEPT}} keeps duplicates (as in Snowflake), which lets it cover some
cases that the duplicate-eliminating version cannot. In case there is a need
to eliminate duplicates, there is {{ARRAY_DISTINCT}}.
Flink follows this approach.
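To illustrate the difference, here is a minimal plain-Python model of a duplicate-preserving except, assuming the Snowflake-style multiset semantics referred to above (each element of the second array cancels at most one matching element of the first). The function names are hypothetical and this is not Flink's implementation; it only shows why a duplicate-preserving result combined with a separate distinct step covers both use cases.

```python
from collections import Counter


def array_except_keep_duplicates(left, right):
    """Hypothetical model: multiset difference that preserves order of `left`.

    Each element of `right` cancels at most one equal element of `left`
    (assumed Snowflake-style semantics, not taken from Flink's code).
    """
    remaining = Counter(right)  # how many copies of each value may still be cancelled
    result = []
    for item in left:
        if remaining[item] > 0:
            remaining[item] -= 1  # cancel exactly one copy
        else:
            result.append(item)
    return result


def array_distinct(values):
    """Hypothetical model: drop repeated values, keeping first occurrences in order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


kept = array_except_keep_duplicates([1, 1, 2, 2, 3], [2])
deduplicated = array_distinct(kept)
```

Under these assumptions, the duplicate-eliminating result can always be derived from the duplicate-preserving one, but not the other way round.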
Yes, there is {{ARRAY_UNION}}, which eliminates duplicates.
However, there is also {{ARRAY_CONCAT}}, which concatenates arrays without
eliminating duplicates; moreover, it can concatenate more than two arrays at
once (as in BigQuery, ClickHouse, DuckDB).
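The contrast between the two functions can be sketched in plain Python. This is only a model of the behavior described in this comment, with hypothetical function names; it is not Flink's implementation, and details such as NULL handling are deliberately left out.

```python
def array_concat(*arrays):
    """Hypothetical model: append any number of arrays, keeping every duplicate."""
    result = []
    for array in arrays:
        result.extend(array)
    return result


def array_union(left, right):
    """Hypothetical model: combine two arrays and drop repeated values.

    First occurrences are kept in order; the ordering is an assumption
    made for this sketch.
    """
    seen = set()
    result = []
    for value in [*left, *right]:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


concatenated = array_concat([1, 2], [2, 3], [3, 4])
united = array_union([1, 2, 2], [2, 3])
```

Here `concatenated` retains the repeated 2 and 3, while `united` contains each value once.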
> Add ARRAY_EXCEPT supported in SQL & Table API
> ---------------------------------------------
>
> Key: FLINK-31663
> URL: https://issues.apache.org/jira/browse/FLINK-31663
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / API
> Reporter: luoyuxia
> Assignee: Hanyu Zheng
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.20.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)