[
https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jark Wu reassigned FLINK-21203:
-------------------------------
Assignee: wangpeibin
> Don’t collect -U&+U Row When they are equals In the LastRowFunction
> ---------------------------------------------------------------------
>
> Key: FLINK-21203
> URL: https://issues.apache.org/jira/browse/FLINK-21203
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Runtime
> Reporter: wangpeibin
> Assignee: wangpeibin
> Priority: Major
>
> In LastRowFunction, the -U and +U rows are emitted even when they are
> identical, which adds unnecessary work for the downstream operator.
>
> To avoid this, we can optimize the logic in DeduplicateFunctionHelper and
> add a configuration option to enable the optimization.
> Given the following SQL:
> {quote}select * from
> (select
> *,
> row_number() over (partition by k order by proctime() desc ) as row_num
> from a
> ) t
> where row_num = 1
> {quote}
> and two identical input rows such as:
> {quote}Event("B","1","b"),
> Event("B","1","b")
> {quote}
> the current output is:
> {quote}(true,+I[B, 1, b, 1])
> (false,-U[B, 1, b, 1])
> (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}
>
>
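The proposed behavior can be illustrated with a minimal, self-contained Java sketch. It does not use Flink's classes or reproduce the actual DeduplicateFunctionHelper code: the class LastRowSketch, the flag skipUnchangedUpdates, and the string-encoded change messages are all hypothetical names introduced only for illustration. Rows here carry just the three event fields, without the row_num column shown in the output above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for keyed last-row deduplication; not Flink's actual implementation. */
final class LastRowSketch {
    // Last row seen per key; plays the role of keyed state in this sketch.
    private final Map<String, List<String>> lastRowByKey = new HashMap<>();
    // Hypothetical switch standing in for the config option proposed in the issue.
    private final boolean skipUnchangedUpdates;

    LastRowSketch(boolean skipUnchangedUpdates) {
        this.skipUnchangedUpdates = skipUnchangedUpdates;
    }

    /** Returns the change messages emitted for one input row. */
    List<String> process(String key, List<String> row) {
        List<String> out = new ArrayList<>();
        List<String> previous = lastRowByKey.get(key);
        if (previous == null) {
            // First row for this key: emit an insert.
            out.add("+I" + row);
        } else if (skipUnchangedUpdates && previous.equals(row)) {
            // Row is identical to the stored one: emit nothing.
            return out;
        } else {
            // Row changed (or the optimization is off): retract the old row, emit the new one.
            out.add("-U" + previous);
            out.add("+U" + row);
        }
        lastRowByKey.put(key, new ArrayList<>(row));
        return out;
    }
}
```

Feeding the row (B, 1, b) twice for key B yields +I, -U, +U with the flag off and only +I with the flag on, which mirrors the before and after outputs in the description.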