[jira] [Commented] (FLINK-12173) Optimize "SELECT DISTINCT" into Deduplicate with keep first row

Jim Hughes (Jira) Fri, 09 Aug 2024 13:32:05 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872447#comment-17872447
 ]


Jim Hughes commented on FLINK-12173:
------------------------------------

[~jark] This looks like a neat optimization.  Would the approach be to write an 
optimizer rule?

If so, are there any corner cases to consider?  Generally, I can imagine 
optimization rules firing in inappropriate situations and causing trouble. 

> Optimize "SELECT DISTINCT" into Deduplicate with keep first row
> ---------------------------------------------------------------
>
>                 Key: FLINK-12173
>                 URL: https://issues.apache.org/jira/browse/FLINK-12173
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / Planner
>            Reporter: Jark Wu
>            Priority: Major
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> The following DISTINCT query can be optimized into a Deduplicate on keys 
> "a, b, c, d" that keeps the first row.
> {code:sql}
> -- T is a placeholder table name, added so the statement parses
> SELECT DISTINCT a, b, c, d FROM T;
> {code}
> We can optimize this query into Deduplicate to get better performance than 
> GroupAggregate.
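>
> A minimal sketch of what the hand-written equivalent could look like, using 
> the ROW_NUMBER() deduplication pattern from the Flink SQL documentation. The 
> table name T and the processing-time attribute proctime are placeholders that 
> do not appear in this issue, and the actual planner rewrite may look 
> different.
> {code:sql}
> -- Assumption: T is a placeholder table with columns a, b, c, d
> -- and a processing-time attribute named proctime.
> SELECT a, b, c, d
> FROM (
>   SELECT a, b, c, d,
>          ROW_NUMBER() OVER (
>            PARTITION BY a, b, c, d
>            ORDER BY proctime ASC
>          ) AS rn
>   FROM T
> )
> WHERE rn = 1;
> {code}
> Because every selected column is part of the partition key, the first row 
> kept per key carries the same values as any later duplicate, so the result 
> should match that of SELECT DISTINCT.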



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
